Fix clearing of sleep timeouts

2022-06-04 Thread Visa Hankala
Encountered the following panic:

panic: kernel diagnostic assertion "(p->p_flag & P_TIMEOUT) == 0" failed: file 
"/usr/src/sys/kern/kern_synch.c", line 373
Stopped at  db_enter+0x10:  popq%rbp
TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
 423109  57118 55 0x3  02  link
 330695  30276 55 0x3  03  link
* 46366  85501 55  0x1003  0x40804001  link
 188803  85501 55  0x1003  0x40820000K link
db_enter() at db_enter+0x10
panic(81f25d2b) at panic+0xbf
__assert(81f9a186,81f372c8,175,81f87c6c) at 
__assert+0x25
sleep_setup(800022d64bf8,800022d64c98,20,81f66ac6,0) at 
sleep_setup+0x1d8
cond_wait(800022d64c98,81f66ac6) at cond_wait+0x46
timeout_barrier(8000228a28b0) at timeout_barrier+0x109
timeout_del_barrier(8000228a28b0) at timeout_del_barrier+0xa2
sleep_finish(800022d64d90,1) at sleep_finish+0x16d
tsleep(823a5130,120,81f0b730,2) at tsleep+0xb2
sys_nanosleep(8000228a27f0,800022d64ea0,800022d64ef0) at 
sys_nanosleep+0x12d
syscall(800022d64f60) at syscall+0x374

The panic is a regression of sys/kern/kern_timeout.c r1.84. Previously,
soft-interrupt-driven timeouts could be deleted synchronously without
blocking. Now, timeout_del_barrier() can sleep regardless of the type
of the timeout.

It looks that with small adjustments timeout_del_barrier() can sleep
in sleep_finish(). The management of run queues is not affected because
the timeout clearing happens after it. As timeout_del_barrier() does not
rely on a timeout or signal catching, there should be no risk of
unbounded recursion or unwanted signal side effects within the sleep
machinery. In a way, a sleep with a timeout is higher-level than
one without.

Note that endtsleep() can run and set P_TIMEOUT during
timeout_del_barrier() when the thread is blocked in cond_wait().
To avoid unnecessary atomic read-modify-write operations, the clearing
of P_TIMEOUT could be conditional, but maybe that is an unnecessary
optimization at this point.

While it should be possible to make the code use timeout_del() instead
of timeout_del_barrier(), the outcome might not be outright better. For
example, sleep_setup() and endtsleep() would have to coordinate so that
a late-running timeout from previous sleep cycle would not disturb the
new cycle.

To test the barrier path reliably, I made the code call
timeout_del_barrier() twice in a row. The second call is guaranteed
to sleep. Of course, this is not part of the patch.

OK?

Index: kern/kern_synch.c
===
RCS file: src/sys/kern/kern_synch.c,v
retrieving revision 1.187
diff -u -p -r1.187 kern_synch.c
--- kern/kern_synch.c   13 May 2022 15:32:00 -  1.187
+++ kern/kern_synch.c   5 Jun 2022 05:04:45 -
@@ -370,8 +370,8 @@ sleep_setup(struct sleep_state *sls, con
p->p_slppri = prio & PRIMASK;
TAILQ_INSERT_TAIL([LOOKUP(ident)], p, p_runq);
 
-   KASSERT((p->p_flag & P_TIMEOUT) == 0);
if (timo) {
+   KASSERT((p->p_flag & P_TIMEOUT) == 0);
sls->sls_timeout = 1;
timeout_add(>p_sleep_to, timo);
}
@@ -432,13 +432,12 @@ sleep_finish(struct sleep_state *sls, in
 
if (sls->sls_timeout) {
if (p->p_flag & P_TIMEOUT) {
-   atomic_clearbits_int(>p_flag, P_TIMEOUT);
error1 = EWOULDBLOCK;
} else {
-   /* This must not sleep. */
+   /* This can sleep. It must not use timeouts. */
timeout_del_barrier(>p_sleep_to);
-   KASSERT((p->p_flag & P_TIMEOUT) == 0);
}
+   atomic_clearbits_int(>p_flag, P_TIMEOUT);
}
 
/* Check if thread was woken up because of a unwind or signal */



Re: pkg_add in -current

2022-06-04 Thread Theo de Raadt
Stuart Henderson  wrote:

> On 2022/06/04 15:23, Theo de Raadt wrote:
> > Stuart Henderson  wrote:
> > 
> > > If you are running -current and have not updated base recently, you
> > > may run inTO "pkg_add: Unknown option: always-update ".
> > > To fix it, just update to a newer base snapshot.
> > 
> > 
> > 
> > What happened is that a developer made a change to the pkg tools which
> > creates completely incompatible package files.
> > 
> > Noone knew this was happening beforehands.  An email was circulated
> > after-the-fact to ports@ list (which is mostly read by developers, and
> > not read by users).  It has now been weeks, and it still hasn't been
> > clearly communicated.
> 
> People can decide for themselves about that,
> 
> First commit enabling parsing in pkg_add
> https://github.com/openbsd/src/commit/5cb7aebf4211294fd2891b0bc45c383ab7fd66af

That commit message does not say:

 There will be no backwards compatiblity.

> "REMINDER: snapshots go with -current"
> https://marc.info/?l=openbsd-ports=165355109123377=2

That message says:

 There is zero effort being made for backwards compatiblity.

It also says it is going to be FUN.  Are we having fun?  We are not
having fun.  This is the case of one developer (who did not even
explain what was happening to any non-ports developer) making a decision
in their own bubble, without communicating the impact in a way that
everyone can understand.

> Second commit, after base is updated with this subsequent package builds
> use the new annotation
> https://github.com/openbsd/src/commit/c2e596a17ac45689d758df0d67597fef94480ebe

That commit message does not say:

 No effort has been made for backwards compatibility.

> (Then it takes time for new packages to be built on the various archs
> and it's not until *then* that errors would show up for people who
> haven't updated base yet)

So here we are:

There is no backwards compatibility, and users are starting to
encounter the problem, and the answer for them is that they must reboot.
No it's not just that, they are being told the PROCESS WAS GREAT, and
what is wrong here is *THEIR* process of using snapshots.

It has also been pointed out that current.html has no information about this
change.  I have been saying for a while we should delete current.html
because it seems to always contain useless information, and here we see
it lacks crucial information.



Re: pkg_add in -current

2022-06-04 Thread Stuart Henderson
On 2022/06/04 15:23, Theo de Raadt wrote:
> Stuart Henderson  wrote:
> 
> > If you are running -current and have not updated base recently, you
> > may run inTO "pkg_add: Unknown option: always-update ".
> > To fix it, just update to a newer base snapshot.
> 
> 
> 
> What happened is that a developer made a change to the pkg tools which
> creates completely incompatible package files.
> 
> Noone knew this was happening beforehands.  An email was circulated
> after-the-fact to ports@ list (which is mostly read by developers, and
> not read by users).  It has now been weeks, and it still hasn't been
> clearly communicated.

People can decide for themselves about that,

First commit enabling parsing in pkg_add
https://github.com/openbsd/src/commit/5cb7aebf4211294fd2891b0bc45c383ab7fd66af

"REMINDER: snapshots go with -current"
https://marc.info/?l=openbsd-ports=165355109123377=2

Second commit, after base is updated with this subsequent package builds
use the new annotation
https://github.com/openbsd/src/commit/c2e596a17ac45689d758df0d67597fef94480ebe

(Then it takes time for new packages to be built on the various archs
and it's not until *then* that errors would show up for people who
haven't updated base yet)



Re: ix(4): Add support for TCP Large Receive Offloading

2022-06-04 Thread Hrvoje Popovski
On 4.6.2022. 21:23, Hrvoje Popovski wrote:
> Hi all,
> 
> I've put this diff in production on clean source from this morning and
> got panic. I'm not 100% sure if it's because of TSO because in a last
> monts i had all kinds of diffs on production boxes.
> Now I will run spanshot maybe clean spanshot will panic :))
> 
> I've couldn't trigger panic with TSO diff in lab ..

I think I've found something...

setup like "ix-aggr-vlan-ip", doesn't work as expected with tso
setups like "ix-aggr-ip" or "ix-vlan-ip" works as expected

sending tcp stream with tcpbech from box with clean snapshot to host
with "ix-aggr-vlan-ip" and tso

0.000 Peak Mbps:0.000 Avg Mbps:0.000
0.011 Peak Mbps:0.011 Avg Mbps:0.011
0.000 Peak Mbps:0.011 Avg Mbps:0.000
0.011 Peak Mbps:0.011 Avg Mbps:0.011
0.011 Peak Mbps:0.011 Avg Mbps:0.011
0.000 Peak Mbps:0.011 Avg Mbps:0.000
0.011 Peak Mbps:0.011 Avg Mbps:0.011
0.000 Peak Mbps:0.011 Avg Mbps:0.000
0.011 Peak Mbps:0.011 Avg Mbps:0.011


without tso

1075.227 Peak Mbps: 1075.227 Avg Mbps: 1075.227
2163.509 Peak Mbps: 2163.509 Avg Mbps: 2163.509
2147.523 Peak Mbps: 2163.509 Avg Mbps: 2147.523
2183.271 Peak Mbps: 2183.271 Avg Mbps: 2183.271
2690.801 Peak Mbps: 2690.801 Avg Mbps: 2690.801
2602.508 Peak Mbps: 2690.801 Avg Mbps: 2602.508
3036.190 Peak Mbps: 3036.190 Avg Mbps: 3036.190
2799.911 Peak Mbps: 3036.190 Avg Mbps: 2799.911



Re: pkg_add in -current

2022-06-04 Thread Theo de Raadt
Stuart Henderson  wrote:

> If you are running -current and have not updated base recently, you
> may run inTO "pkg_add: Unknown option: always-update ".
> To fix it, just update to a newer base snapshot.



What happened is that a developer made a change to the pkg tools which
creates completely incompatible package files.

Noone knew this was happening beforehands.  An email was circulated
after-the-fact to ports@ list (which is mostly read by developers, and
not read by users).  It has now been weeks, and it still hasn't been
clearly communicated.

We break compatibility often, but generally ensure the right people --
both developers and users -- know when they need to know.  This is
important because people who follow snapshots (in various ways) should
have a good experience because if they don't enjoy the snapshot
experience, we may end up with a smaller test community between
releases.

Sometimes there are surprises in snapshots of a testing nature, but this
particular change was not deployed or communicated as a test (we cannot
go back).

The normal model was not followed in this instance.



pkg_add in -current

2022-06-04 Thread Stuart Henderson
If you are running -current and have not updated base recently, you
may run inTO "pkg_add: Unknown option: always-update ".
To fix it, just update to a newer base snapshot.




Re: ix(4): Add support for TCP Large Receive Offloading

2022-06-04 Thread Hrvoje Popovski
On 1.6.2022. 11:21, Jan Klemkow wrote:
> I moved the switch to ifconfig(8) in the diff below.
> 
> # ifconfig ix0 tso
> # ifconfig ix0 -tso
> 
> I named it tso (TCP segment offloading), so I can reuse this switch
> also for the sending part.  TSO is the combination of LRO and LSO.
> 
> LRO: Large Receive Offloading
> LSO: Large Send Offloading
> 
> RSC (Receive Side Coalescing) is an alternative term for LRO, which is
> used by the spec of ix(4) NICs.
> 
>>> Tests with other ix(4) NICs are welcome and needed!
>> We'll try and kick it around at work in the next week or so.

Hi all,

I've put this diff in production on clean source from this morning and
got panic. I'm not 100% sure if it's because of TSO because in a last
monts i had all kinds of diffs on production boxes.
Now I will run spanshot maybe clean spanshot will panic :))

I've couldn't trigger panic with TSO diff in lab ..


panic:

bcbnfw1# panic: kernel diagnostic assertion "m->m_len >= ETHER_HDR_LEN"
failed: file "/sys/net/bpf.c", line 1489
Stopped at  db_enter+0x10:  popq%rbp
TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
 444766  59724  0 0x14000  0x2001  softnet
db_enter() at db_enter+0x10
panic(81f25898) at panic+0xbf
__assert(81f99ec1,81fd1b33,5d1,81f465eb) at
__assert+0x25
bpf_mtap_ether(80489600,fd80610e0f00,1) at bpf_mtap_ether+0xef
ifiq_input(8048eb00,800020b59b10) at ifiq_input+0xf3
ixgbe_rxeof(8048d3a0) at ixgbe_rxeof+0x32b
ixgbe_queue_intr(8048ab00) at ixgbe_queue_intr+0x3c
intr_handler(800020b59c50,8045dd00) at intr_handler+0x6e
Xintr_ioapic_edge0_untramp() at Xintr_ioapic_edge0_untramp+0x18f
acpicpu_idle() at acpicpu_idle+0x203
sched_idle(800020892ff0) at sched_idle+0x280
end trace frame: 0x0, count: 4
https://www.openbsd.org/ddb.html describes the minimum info required in
bug reports.  Insufficient info makes it difficult to find and fix bugs.
ddb{2}>


ddb{2}> show reg
rdi0
rsi 0x14
rbp   0x800020b59930
rbx   0xfd80610e0f00
rdx   0xc800
rcx0x282
rax 0x68
r8 0x101010101010101
r9 0
r10   0xd410e1bbe041370a
r11   0xff1c7218c30edf0a
r12   0x800020893a60
r13   0x800020b59a90
r140
r15   0x81f25898cmd0646_9_tim_udma+0x29032
rip   0x81813a30db_enter+0x10
cs   0x8
rflags 0x286
rsp   0x800020b59930
ss  0x10
db_enter+0x10:  popq%rbp


ddb{2}> ps
   PID TID   PPIDUID  S   FLAGS  WAIT  COMMAND
 58540  192000  34045  0  30x100083  ttyin ksh
 34045  380261  43095   1000  30x10008b  sigsusp   ksh
 43095  382180  79096   1000  30x98  kqreadsshd
 79096  410005  83400  0  30x82  kqreadsshd
 67346  388175  1  0  30x100083  ttyin ksh
  1576  379710  1  0  30x100098  kqreadcron
 80308  475205  63710720  3   0x190  kqreadlldpd
 63710  264590  1  0  30x80  netio lldpd
 70690  148684  79519 95  3   0x1100092  kqreadsmtpd
 96229  368855  79519103  3   0x1100092  kqreadsmtpd
 57602  230927  79519 95  3   0x1100092  kqreadsmtpd
 27959  412218  79519 95  30x100092  kqreadsmtpd
 68321  253958  79519 95  3   0x1100092  kqreadsmtpd
 15048  386230  79519 95  3   0x1100092  kqreadsmtpd
 79519   40897  1  0  30x100080  kqreadsmtpd
 81998   89562  1 77  3   0x1100090  kqreaddhcpd
 30223   19018  1  0  30x100080  kqreadsnmpd
 16185  104619  1 91  3   0x192  kqreadsnmpd
 83400  393345  1  0  30x88  kqreadsshd
 59135   14654  1  0  30x100080  kqreadntpd
 93144  455149  50136 83  30x100092  kqreadntpd
 50136  231167  1 83  3   0x1100092  kqreadntpd
 26285  193609  59081 74  3   0x1100092  bpf   pflogd
 59081  514111  1  0  30x80  netio pflogd
 90237  222054  74741 73  3   0x1100090  kqreadsyslogd
 74741   91812  1  0  30x100082  netio syslogd
 50581  332657  0  0  3 0x14200  bored smr
 11773  152326  0  0  3 0x14200  pgzerozerothread
 45553   54153  0  0  3 0x14200  aiodoned  aiodoned
 31391  475879  0  0  3 0x14200  syncerupdate
 44830  301726  0  0  3 0x14200  cleaner   cleaner
 62337   61034  0  0  3 0x14200  reaperreaper
 97528  277547   

Re: dump(8) wording

2022-06-04 Thread Theo de Raadt
Jason McIntyre  wrote:

e> On Thu, Jun 02, 2022 at 02:49:43PM +0200, Jan Stary wrote:
> > The following wording of dump(8)
> > can IMHO be be simplified without any loss:
> > 
> > Rewinding or ejecting tape features after a close operation
> > on a tape device depend on the name of the tape unit device used.
> > 
> > I am not a native speaker; but if I parse that right,
> > what "features" are those? Rewinding or ejecting?
> > Then just say that. (Surely dump does not "rewind tape features"
> > or "eject tape features"). Also, "feature" being both a noun and a verb
> > requires extra parsing effort, at least for a non-native speaker. So:
> > 
> > Rewinding or ejecting a tape after a close operation
> > on a tape device depends on the name of the tape unit device.
> > 
> 
> so, this piece of text in dump(8) is talking about "features" because,
> eject and rewind can apparently happen in different ways. if you read
> "EJECT AND REWIND" in st(4), which the text directs you to in the next
> sentence, you'll see what it's getting at.

I think it is great that dump hints at such behaviours in a general
sense.  Someone who needs to dig deeper, will do so.

jmc, I think you are right about the sentence hinting about a variety-pack
of behaviours.

> i personally can;t really see a simpler way to say it. i suppose we
> could scrap the entire paragraph and just send people to st(4), but i'm
> not sure if that's any more helpful.

there used to be tape drives which were not SCSI, and in the future
there may be such devices again.

What's going on here is that st(4) describes the behaviours in a
"scsi tape specification way".  The vax tree used to have a whole
bunch of non-scsi tape drivers, and more exact and discrete behaviours
existed in many of those.

So no I don't think we can point people _directly_ at st(4), they have
to discern that themselves.

People will find the st(4) manual page easily.

> > I don't know the difference between a "tape device"
> > and a "tape unit device", so I left the "unit" there;
> > if it's superfluous, it can perhaps be "name of the device".
> > 
> > Also:
> > 
> > dump requires operator intervention on these conditions:
> > end of tape, end of dump, ...
> > 
> > dump never required any intervention from me on an "end of dump",
> > it simply says DUMP IS DONE. Would "volume" be more precise here?
> > I don't use tapes, but I suppose an intervention is in order
> > at "end of volume" or perhaps "end of media".
> > 
> 
> never having used a tape device, i can't say anything about that.

I don't understand what the proposal is.



Re: dump(8) wording

2022-06-04 Thread Jason McIntyre
On Thu, Jun 02, 2022 at 02:49:43PM +0200, Jan Stary wrote:
> The following wording of dump(8)
> can IMHO be be simplified without any loss:
> 
>   Rewinding or ejecting tape features after a close operation
>   on a tape device depend on the name of the tape unit device used.
> 
> I am not a native speaker; but if I parse that right,
> what "features" are those? Rewinding or ejecting?
> Then just say that. (Surely dump does not "rewind tape features"
> or "eject tape features"). Also, "feature" being both a noun and a verb
> requires extra parsing effort, at least for a non-native speaker. So:
> 
>   Rewinding or ejecting a tape after a close operation
>   on a tape device depends on the name of the tape unit device.
> 

so, this piece of text in dump(8) is talking about "features" because,
eject and rewind can apparently happen in different ways. if you read
"EJECT AND REWIND" in st(4), which the text directs you to in the next
sentence, you'll see what it's getting at.

i personally can;t really see a simpler way to say it. i suppose we
could scrap the entire paragraph and just send people to st(4), but i'm
not sure if that's any more helpful.

> I don't know the difference between a "tape device"
> and a "tape unit device", so I left the "unit" there;
> if it's superfluous, it can perhaps be "name of the device".
> 
> Also:
> 
>   dump requires operator intervention on these conditions:
>   end of tape, end of dump, ...
> 
> dump never required any intervention from me on an "end of dump",
> it simply says DUMP IS DONE. Would "volume" be more precise here?
> I don't use tapes, but I suppose an intervention is in order
> at "end of volume" or perhaps "end of media".
> 

never having used a tape device, i can't say anything about that.

jmc



Re: httpd: add include_dir keyword

2022-06-04 Thread mfrench
> I do not understand why it is believed that people will generate
> better configurations if they split the parts out into different
> files.

I can not speak for "better" configurations as a result of dividing up
the main configuration file. Although I believe it lowers the risk of
mistakenly mangling parts of the configuration while otherwise working
within a single file. The risk increases when changes are commonplace.
I frequently add and remove server definitions as resouces I am hosting
change. I already use the existing include directive for this very
reason.
 
> Adding that kind of trick to an already established grammer rarely works
> well.  It only works in narrowly constrained uses of the old grammer,
> because now one must consider what is in the included files.  At that
> point, why the extra files?  It does not require less brainpower, it
> potentially requires more, when the included files start interfering
> with the core.
> 
> This feels ripe for abuse, and of not much use.

Adding support for including globbed paths may be somewhat offensive to
the existing include grammer, and as you mentioned users may overlook
something wrong or even malicious in their include path. Luckily users
can still include specific files or they can completely forgo use of
the include directive. Users would effectivly opt-in by providing a
globbed path to include. This is admittedly a minor convenience feature
which may not be useful enough to overcome the security risks. 

In my workflow, I currently add/remove (uncomment/comment) include lines
in my main configuration.  Where as with globbed includes I would 
`import "/etc/httpd.d/*.conf"` enabling me to switch between .conf and
.disabled and restart the daemon to switch virtual servers on or off.

In any case, I was not ready to share my changes for the reasons
discussed above and because my code is neither complete nor correct. I
decided to share since I was working in the same vein. Thank you for
your consideration.
-Matt