httpd(8): login/logout delay on WordPress sites after May 17 patch

2021-05-17 Thread Matthias Pressfreund
After the May 17 httpd(8) patch, logging in and logging out of the admin
interface of WordPress sites is delayed by 1 minute. The site itself
appears normally and there are no entries in the error log.



Re: httpd(8): don't try to chunk-encode an empty body

2021-05-17 Thread Sebastian Benoit
Florian Obser(flor...@openbsd.org) on 2021.05.15 17:14:38 +0200:
> Turns out it's not that difficult to do this correctly since we already
> wait until we read all http headers from the fcgi upstream. We just need
> to delay writing of the http header until we know if the body is empty
> or not.
> 
> OK?

seems to work, ok benno@

The comments in server_fcgi_header seem to suggest more dragons lurk in this
area.

> 
> 
> diff --git httpd.h httpd.h
> index b3a40b3af68..c4adfba232d 100644
> --- httpd.h
> +++ httpd.h
> @@ -300,6 +300,7 @@ struct fcgi_data {
>   int  end;
>   int  status;
>   int  headersdone;
> + int  headerssent;
>  };
>  
>  struct range {
> diff --git server_fcgi.c server_fcgi.c
> index 64a0e9d2abb..e1e9704c920 100644
> --- server_fcgi.c
> +++ server_fcgi.c
> @@ -114,6 +114,7 @@ server_fcgi(struct httpd *env, struct client *clt)
>   clt->clt_fcgi.toread = sizeof(struct fcgi_record_header);
>   clt->clt_fcgi.status = 200;
>   clt->clt_fcgi.headersdone = 0;
> + clt->clt_fcgi.headerssent = 0;
>  
>   if (clt->clt_srvevb != NULL)
>   evbuffer_free(clt->clt_srvevb);
> @@ -544,22 +545,20 @@ server_fcgi_read(struct bufferevent *bev, void *arg)
>   if (!clt->clt_fcgi.headersdone) {
>   clt->clt_fcgi.headersdone =
>   server_fcgi_getheaders(clt);
> - if (clt->clt_fcgi.headersdone) {
> - if (server_fcgi_header(clt,
> - clt->clt_fcgi.status)
> - == -1) {
> - server_abort_http(clt,
> - 500,
> - "malformed fcgi "
> - "headers");
> - return;
> - }
> - }
>   if (!EVBUFFER_LENGTH(clt->clt_srvevb))
>   break;
>   }
>   /* FALLTHROUGH */
>   case FCGI_END_REQUEST:
> + if (clt->clt_fcgi.headersdone &&
> + !clt->clt_fcgi.headerssent) {
> + if (server_fcgi_header(clt,
> + clt->clt_fcgi.status) == -1) {
> + server_abort_http(clt, 500,
> + "malformed fcgi headers");
> + return;
> + }
> + }
>   if (server_fcgi_writechunk(clt) == -1) {
>   server_abort_http(clt, 500,
>   "encoding error");
> @@ -600,6 +599,8 @@ server_fcgi_header(struct client *clt, unsigned int code)
>   char tmbuf[32];
>   struct kv   *kv, *cl, key;
>  
> + clt->clt_fcgi.headerssent = 1;
> +
>   if (desc == NULL || (error = server_httperror_byid(code)) == NULL)
>   return (-1);
>  
> @@ -615,6 +616,12 @@ server_fcgi_header(struct client *clt, unsigned int code)
>   if (kv_add(>http_headers, "Server", HTTPD_SERVERNAME) == NULL)
>   return (-1);
>  
> + if (clt->clt_fcgi.type == FCGI_END_REQUEST ||
> + EVBUFFER_LENGTH(clt->clt_srvevb) == 0) {
> + /* Can't chunk encode an empty body. */
> + clt->clt_fcgi.chunked = 0;
> + }
> +
>   /* Set chunked encoding */
>   if (clt->clt_fcgi.chunked) {
>   /* XXX Should we keep and handle Content-Length instead? */
> 
> 
> -- 
> I'm not entirely sure you are real.
> 



Re: [External] : Re: parallel forwarding vs. bridges

2021-05-17 Thread Alexandr Nedvedicky
Hello,

On Mon, May 17, 2021 at 08:19:37PM +0200, Alexander Bluhm wrote:
> On Mon, May 17, 2021 at 04:24:15PM +0200, Alexandr Nedvedicky wrote:
> > in current tree the ether_input() is protected by NET_LOCK(), which is 
> > grabbed
> > by caller as a writer. bluhm's diff changes NET_LOCK() readlock, so
> > ether_input() can run concurrently. Switching NET_LOCK() to r-lock has
> > implications on smr read section above. The ting is the call to 
> > eb->eb_input()
> > can sleep now. This is something what needs to be avoided within smr 
> > section.
> 
> Do you think the sleeping occures in PF_LOCK()?  Could it be that
> this rwlock did never sleep before, but was acquired instantly?
> 

yes, the sleep occurs on PF_LOCK(), right here in pf_test():
7052a = s->anchor.ptr;
7053#if NPFLOG > 0
7054pd.pflog |= s->log;
7055#endif  /* NPFLOG > 0 */
7056} else if (s == NULL) {
7057PF_LOCK();
7058have_pf_lock = 1;
7059action = pf_test_rule(, , , , 
,
7060);
7061s = pf_state_ref(s);
(gdb)

with exclusive NET_LOCK() the PF_LOCK() never blocks/sleeps, because
exclusive NET_LOCK() ensures there is only single packet running in
pf_test().  changing exclusive NET_LOCK() to reader lock allows more
packets to run in pf_test(), therefore PF_LOCK() can block.


> Having sleeps in the packet forwarding path is a bad idea.  And
> refcounting per packet looks like a workarund.  It does not make
> things faster.

I'm afraid sleeps (exclusive locks) can not be avoided with current data
structures we have in PF. for example if packet will be creating state,
then it must grab state lock exclusively.

we have to make sure the sleep operation won't occur in code, which
is protected by lock. Such risk comes from blocking malloc being
called by caller, which owns lock (either reader or writer).

> 
> How about implementing PF_LOCK() with a mutex?

As I've answered to mpi@ already, there are two locks, which
we need to change to mutexes:
PF_LOCK() and state lock
> 
> I made such a diff a few months ago.  It is not well tested yet.
> There were some problems when allocating pf mutex in ioctl path.
> mpi@ did not like it, as it removes the pf lock in pf ioctl and
> relies on net lock for reconfigure.

I've read through your diff. If I understand it right two things
are being changed:
PF_LOCK() is thrown away completely and we use NET_LOCK() instead

state_lock is changed to mutex

IMO removing PF_LOCK() is step backward. I think every subsystem (including
PF) should provide its own lock and should not piggy back on NET_LOCK().
> 
> It is a design question.  Do we want to keep net lock for reconfigure?

No, we don't want NET_LOCK() for reconfigure. configuration changes
should be synchronized using PF_LOCK()

> If we process packets with shared netlock and reconfigure pf with
> exclusive netlock this looks reasonable.
> 
> Do we want to go this way?  Then I will polish and test this diff.

no I think we don't want to go that way.

if it comes to pf(4) configuration is more colorful.  the ioctl, which updates
firewall configuration grabs lock exclusively (be it current pf_lock or
net_lock). For certain operations (queue management and tables) we also
allocate memory _without_ NOSLEEP. We may end up in situation when we are 
sleeping in malloc while holding lock, which we share with packets.

Therefore I think we actually need extra rw-lock just for ioctl.
this new lock will serialize all callers which will come from userland.
we don't mind to sleep on malloc with ioctl-rw lock held exclusively.

as soon as memory for desired operation will be acquired, then it
will be time to grab PF_LOCK() as a writer and update firewall config.
this way we should avoid blocking processing of packets due to
memory allocations.

I think Vitaliy implemented something similar for other subsystem,
can't remember, which one (?pipex?).

thanks and
regards
sashan



Re: [External] : Re: parallel forwarding vs. bridges

2021-05-17 Thread Alexandr Nedvedicky
On Mon, May 17, 2021 at 08:27:12PM +0200, Martin Pieuchot wrote:
> On 17/05/21(Mon) 19:52, Alexandr Nedvedicky wrote:
> > [...] 
> > I don't mind to trade pf_lock and pf_state_lock for mutexes, however
> > I see such step is kind of 'NO-OP'. We do have sufficient measure
> > currently, which is: keep NET_LOCK() as is. May be I'm not seeing
> > your idea/plan behind changing pf's rw-locks to mutexes. If you feel
> > there is a benefit to go that way, then let's do it, but I'd like
> > to understand where we will be going/what to expect.
> 
> I've no idea or plan.  I'm just pointing out that using rwlocks, for the
> moment, add extra work.  If it's easy to use mutexes then you might want
> to start with that.  The whole network processing path assumes it runs
> without sleeping.  I've no idea what can happen when this assumption
> will be broken.
> 

I see. I keep forgetting about local bound traffic, which is
far more complex than forwarding scenario.

> I'm well aware that using a single big pf lock is not the best for
> performances, but maybe it's easier to do baby steps.
> 
> That said, I don't want to stop or discourage anyone.  If you're
> confident enough that rwlocks are the way to go, then please, go
> ahead.
> 

I think we can keep mutexes as a kind of fallback plan if we decide
to commit bluhm's diff.

thanks and
regards
sashan



Re: ftpd(8): remove useless parameter of get_line()

2021-05-17 Thread Chris Cappuccio
Jan Klemkow [j.klem...@wemelug.de] wrote:
> Hi,
> 
> This diff removes the useless FILE* parameter of get_line().  In every
> call this parameter is always "stdin".  Thus, we can replace ever use of
> the variable iop with stdin.
> 
> Like every other diff, I tested this diff with the ftpd regression
> tests.
> 
> OK?
> 

looks fine to me



Re: [External] : Re: parallel forwarding vs. bridges

2021-05-17 Thread Martin Pieuchot
On 17/05/21(Mon) 19:52, Alexandr Nedvedicky wrote:
> [...] 
> I don't mind to trade pf_lock and pf_state_lock for mutexes, however
> I see such step is kind of 'NO-OP'. We do have sufficient measure
> currently, which is: keep NET_LOCK() as is. May be I'm not seeing
> your idea/plan behind changing pf's rw-locks to mutexes. If you feel
> there is a benefit to go that way, then let's do it, but I'd like
> to understand where we will be going/what to expect.

I've no idea or plan.  I'm just pointing out that using rwlocks, for the
moment, add extra work.  If it's easy to use mutexes then you might want
to start with that.  The whole network processing path assumes it runs
without sleeping.  I've no idea what can happen when this assumption
will be broken.

I'm well aware that using a single big pf lock is not the best for
performances, but maybe it's easier to do baby steps.

That said, I don't want to stop or discourage anyone.  If you're
confident enough that rwlocks are the way to go, then please, go
ahead.

Cheers,
Martin



Re: parallel forwarding vs. bridges

2021-05-17 Thread Alexander Bluhm
On Mon, May 17, 2021 at 04:24:15PM +0200, Alexandr Nedvedicky wrote:
> in current tree the ether_input() is protected by NET_LOCK(), which is grabbed
> by caller as a writer. bluhm's diff changes NET_LOCK() readlock, so
> ether_input() can run concurrently. Switching NET_LOCK() to r-lock has
> implications on smr read section above. The ting is the call to eb->eb_input()
> can sleep now. This is something what needs to be avoided within smr section.

Do you think the sleeping occures in PF_LOCK()?  Could it be that
this rwlock did never sleep before, but was acquired instantly?

Having sleeps in the packet forwarding path is a bad idea.  And
refcounting per packet looks like a workarund.  It does not make
things faster.

How about implementing PF_LOCK() with a mutex?

I made such a diff a few months ago.  It is not well tested yet.
There were some problems when allocating pf mutex in ioctl path.
mpi@ did not like it, as it removes the pf lock in pf ioctl and
relies on net lock for reconfigure.

It is a design question.  Do we want to keep net lock for reconfigure?
If we process packets with shared netlock and reconfigure pf with
exclusive netlock this looks reasonable.

Do we want to go this way?  Then I will polish and test this diff.

bluhm

Index: net/pf.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/net/pf.c,v
retrieving revision 1.1116
diff -u -p -r1.1116 pf.c
--- net/pf.c27 Apr 2021 09:38:29 -  1.1116
+++ net/pf.c17 May 2021 17:38:20 -
@@ -6819,6 +6819,8 @@ pf_test(sa_family_t af, int fwdir, struc
if (!pf_status.running)
return (PF_PASS);
 
+   NET_ASSERT_LOCKED();
+
 #if NCARP > 0
if (ifp->if_type == IFT_CARP &&
(ifp0 = if_get(ifp->if_carpdevidx)) != NULL) {
Index: net/pf_ioctl.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/net/pf_ioctl.c,v
retrieving revision 1.363
diff -u -p -r1.363 pf_ioctl.c
--- net/pf_ioctl.c  9 Feb 2021 23:37:54 -   1.363
+++ net/pf_ioctl.c  17 May 2021 17:38:20 -
@@ -146,8 +146,8 @@ TAILQ_HEAD(pf_tags, pf_tagname) pf_tags 
  * grab the lock as writer. Whenever packet creates state it grabs pf_lock
  * first then it locks pf_state_lock as the writer.
  */
-struct rwlock   pf_lock = RWLOCK_INITIALIZER("pf_lock");
-struct rwlock   pf_state_lock = RWLOCK_INITIALIZER("pf_state_lock");
+struct mutex   pf_lock = MUTEX_INITIALIZER(IPL_SOFTNET);
+struct mutex   pf_state_lock = MUTEX_INITIALIZER(IPL_SOFTNET);
 
 #if (PF_QNAME_SIZE != PF_TAG_NAME_SIZE)
 #error PF_QNAME_SIZE must be equal to PF_TAG_NAME_SIZE
@@ -1009,7 +1009,6 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
 
case DIOCSTART:
NET_LOCK();
-   PF_LOCK();
if (pf_status.running)
error = EEXIST;
else {
@@ -1023,13 +1022,11 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
pf_create_queues();
DPFPRINTF(LOG_NOTICE, "pf: started");
}
-   PF_UNLOCK();
NET_UNLOCK();
break;
 
case DIOCSTOP:
NET_LOCK();
-   PF_LOCK();
if (!pf_status.running)
error = ENOENT;
else {
@@ -1038,7 +1035,6 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
pf_remove_queues();
DPFPRINTF(LOG_NOTICE, "pf: stopped");
}
-   PF_UNLOCK();
NET_UNLOCK();
break;
 
@@ -1048,7 +1044,6 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
u_int32_tnr = 0;
 
NET_LOCK();
-   PF_LOCK();
pq->ticket = pf_main_ruleset.rules.active.ticket;
 
/* save state to not run over them all each time? */
@@ -1058,7 +1053,6 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
nr++;
}
pq->nr = nr;
-   PF_UNLOCK();
NET_UNLOCK();
break;
}
@@ -1069,10 +1063,8 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
u_int32_tnr = 0;
 
NET_LOCK();
-   PF_LOCK();
if (pq->ticket != pf_main_ruleset.rules.active.ticket) {
error = EBUSY;
-   PF_UNLOCK();
NET_UNLOCK();
break;
}
@@ -1083,12 +1075,10 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
qs = TAILQ_NEXT(qs, entries);
if (qs == NULL) {
error = EBUSY;
-   PF_UNLOCK();
NET_UNLOCK();
break;
}
memcpy(>queue, qs, 

Re: [External] : Re: parallel forwarding vs. bridges

2021-05-17 Thread Alexandr Nedvedicky
Hello,

On Mon, May 17, 2021 at 06:59:36PM +0200, Martin Pieuchot wrote:
> On 17/05/21(Mon) 16:24, Alexandr Nedvedicky wrote:
> > Hrvoje,
> > 
> > managed to trigger diagnostic panics with diff [1] sent by bluhm@
> > some time ago. The panic Hrvoje sees comes from ether_input() here:
> > 
> >  414 
> >  415 /*
> >  416  * Third phase: bridge processing.
> >  417  *
> >  418  * Give the packet to a bridge interface, ie, bridge(4),
> >  419  * switch(4), or tpmr(4), if it is configured. A bridge
> >  420  * may take the packet and forward it to another port, or it
> >  421  * may return it here to ether_input() to support local
> >  422  * delivery to this port.
> >  423  */
> >  424 
> >  425 ac = (struct arpcom *)ifp;
> >  426 
> >  427 smr_read_enter();
> >  428 eb = SMR_PTR_GET(>ac_brport);
> >  429 if (eb != NULL) {
> >  430 m = (*eb->eb_input)(ifp, m, dst, eb->eb_port);
> >  431 if (m == NULL) {
> >  432 smr_read_leave();
> >  433 return;
> >  434 }
> >  435 }
> >  436 smr_read_leave();
> >  437 
> > 
> > in current tree the ether_input() is protected by NET_LOCK(), which is 
> > grabbed
> > by caller as a writer. bluhm's diff changes NET_LOCK() readlock, so
> > ether_input() can run concurrently. Switching NET_LOCK() to r-lock has
> > implications on smr read section above. The ting is the call to 
> > eb->eb_input()
> > can sleep now. This is something what needs to be avoided within smr 
> > section.
> 
> Is the new sleeping point introduced by the fact the PF_LOCK() is a
> rwlock?  Did you consider using a mutex, at least for the time being,
> in order to not run in such issues?

there are two locks in pf_test():
PF_LOCK()

PF_STATE_LOCK()

we would have to convert both locks into mutexes.

trading PF_STATE_LOCK() into mutex is step backward IMO.
bluhm's parallel diff enables PF to run in parallel for packets.
if packet matches existing state, it just grabs state lock as
a reader. This brings huge benefit, because majority of packets
just do the state look up. Only some packets need to go expensive
way through rules and state creation, which involves PF_LOCK() and
write lock on state rw.

as readers and are done with PF.

trading both locks for mutexes will do the trick too. However I'm not
sure I see a benefit there (when comparing to current NET_LOCK() we have
in tree).

I think all safety belts we have kicked in. We learned smr read block is
not an option for ether_input() when comes to bridges.  perhaps alternative
solution would be to introduce an RW-lock, which would protect 'ac->br_port'.
Such R-lock would cover call ->eb_input(). However I'm not sure about it.

I don't mind to trade pf_lock and pf_state_lock for mutexes, however
I see such step is kind of 'NO-OP'. We do have sufficient measure
currently, which is: keep NET_LOCK() as is. May be I'm not seeing
your idea/plan behind changing pf's rw-locks to mutexes. If you feel
there is a benefit to go that way, then let's do it, but I'd like
to understand where we will be going/what to expect.

thanks and
regards
sashan



Re: parallel forwarding vs. bridges

2021-05-17 Thread Hrvoje Popovski
On 17.5.2021. 16:24, Alexandr Nedvedicky wrote:
> Hrvoje,
> 
> managed to trigger diagnostic panics with diff [1] sent by bluhm@
> some time ago. The panic Hrvoje sees comes from ether_input() here:

> [1] https://marc.info/?l=openbsd-tech=161903387904923=2
> 
> [2] https://marc.info/?l=openbsd-tech=162099270106067=2

for those interested, I sent emails to a couple of developers with the
following panics

c/p

Hi guys,

i'm testing bluhm@ parallel diff with pf and tpmr/veb/bridge.
if i'm sending traffic over tmpr or veb, the same second i enable pf,
box panic.. see attachment ..

with bridge i can't panic box, but it seems that performance is even
worse than without parallel diff ... i will play with it more, maybe
it's just some glitch ...
pf and tpmr

r620-1# panic: kernel diagnostic assertion 
"curcpu()->ci_schedstate.spc_smrdepth == 0" failed: file "/sys/kern/sub
r_xxx.c", line 163
Stopped at  db_enter+0x10:  popq%rbp
TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
  11538  95266  0 0x14000  0x2002  softnet
 239565   4172  0 0x14000  0x2001  softnet
db_enter() at db_enter+0x10
panic(81e0198a) at panic+0x12a
__assert(81e6f4ee,81e93713,a3,81ea877c) at __assert+0x2b
assertwaitok() at assertwaitok+0xdc
mi_switch() at mi_switch+0x40
sleep_finish(800023877740,1) at sleep_finish+0x11c
rw_enter(8212e900,1) at rw_enter+0x229
pf_test(2,1,80082048,800023877978) at pf_test+0x1055
tpmr_pf(80082048,1,fd80cccb8c00) at tpmr_pf+0x7e
tpmr_input(80082048,fd80cccb8c00,90e2ba1d4f89,80680f00) at 
tpmr_input+0x124
ether_input(80082048,fd80cccb8c00) at ether_input+0xf5
if_input_process(80082048,800023877af8) at if_input_process+0x6f
ifiq_process(80086000) at ifiq_process+0x69
taskq_thread(80030300) at taskq_thread+0x9f
end trace frame: 0x0, count: 1
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports.  Insufficient info makes it difficult to find and fix bugs.
ddb{0}>


ddb{0}> show panic
kernel diagnostic assertion "curcpu()->ci_schedstate.spc_smrdepth == 0" failed:
 file "/sys/kern/subr_xxx.c", line 163


ddb{0}> show locks
shared rwlock netlock r = 0 (0x8212eab0)
#0  witness_lock+0x339
#1  if_input_process+0x43
#2  ifiq_process+0x69
#3  taskq_thread+0x9f
#4  proc_trampoline+0x1c
shared rwlock softnet r = 0 (0x80030370)
#0  witness_lock+0x339
#1  taskq_thread+0x92
#2  proc_trampoline+0x1c
exclusive sched_lock _lock r = 0 (0x8230a100)
#0  witness_lock+0x339
#1  sleep_setup+0xa5
#2  rw_enter+0x208
#3  pf_test+0x1055
#4  tpmr_pf+0x7e
#5  tpmr_input+0x124
#6  ether_input+0xf5
#7  if_input_process+0x6f
#8  ifiq_process+0x69
#9  taskq_thread+0x9f
#10 proc_trampoline+0x1c


ddb{0}> show reg
rdi   0x820f1c80kprintf_mutex
rsi  0x5
rbp   0x8000238775a0
rbx   0x8000238775b0
rdx   0x8000
rcx0x206
rax  0x1
r80x800023877560
r9 0
r100
r11   0x80ef1bc6ac87b49d
r12 0x38
r13   0x800023877650
r140x100
r15   0x81e0198acmd0646_9_tim_udma+0x24a93
rip   0x81106070db_enter+0x10
cs   0x8
rflags 0x282
rsp   0x8000238775a0
ss  0x10
db_enter+0x10:  popq%rbp



ddb{0}> show mbuf
mbuf 0x81106070
m_type: -13108  m_flags: c3cc
m_next: 0x1d3b4c241c334c5d  m_nextpkt: 0x117400fdffac
m_data: 0x  m_len: 3435973836
m_dat: 0x81106090   m_pktdat: 0x811060e0


ddb{0}> ps
   PID TID   PPIDUID  S   FLAGS  WAIT  COMMAND
 80810  268029  1  0  30x100083  ttyin ksh
 90433  180245  1  0  30x100098  poll  cron
 921768747  42701 95  30x100092  kqreadsmtpd
 83989   36449  42701103  30x100092  kqreadsmtpd
 82331  325465  42701 95  30x100092  kqreadsmtpd
 28903  422168  42701 95  30x100092  kqreadsmtpd
  5781   10814  42701 95  30x100092  kqreadsmtpd
 70395   59130  42701 95  30x100092  kqreadsmtpd
 42701  329509  1  0  30x100080  kqreadsmtpd
 47357  342015  1  0  30x80  selectsshd
 91828  134960  1  0  30x100080  poll  ntpd
 42264  230567  72970 83  30x100092  poll  ntpd
 72970  150962  1 83  30x100092  poll  ntpd
 23464  496327  92870 73  30x100090  kqreadsyslogd
 92870  181949  1  0  30x100082  netio syslogd
 15150   95273

Re: parallel forwarding vs. bridges

2021-05-17 Thread Martin Pieuchot
On 17/05/21(Mon) 16:24, Alexandr Nedvedicky wrote:
> Hrvoje,
> 
> managed to trigger diagnostic panics with diff [1] sent by bluhm@
> some time ago. The panic Hrvoje sees comes from ether_input() here:
> 
>  414 
>  415 /*
>  416  * Third phase: bridge processing.
>  417  *
>  418  * Give the packet to a bridge interface, ie, bridge(4),
>  419  * switch(4), or tpmr(4), if it is configured. A bridge
>  420  * may take the packet and forward it to another port, or it
>  421  * may return it here to ether_input() to support local
>  422  * delivery to this port.
>  423  */
>  424 
>  425 ac = (struct arpcom *)ifp;
>  426 
>  427 smr_read_enter();
>  428 eb = SMR_PTR_GET(>ac_brport);
>  429 if (eb != NULL) {
>  430 m = (*eb->eb_input)(ifp, m, dst, eb->eb_port);
>  431 if (m == NULL) {
>  432 smr_read_leave();
>  433 return;
>  434 }
>  435 }
>  436 smr_read_leave();
>  437 
> 
> in current tree the ether_input() is protected by NET_LOCK(), which is grabbed
> by caller as a writer. bluhm's diff changes NET_LOCK() readlock, so
> ether_input() can run concurrently. Switching NET_LOCK() to r-lock has
> implications on smr read section above. The ting is the call to eb->eb_input()
> can sleep now. This is something what needs to be avoided within smr section.

Is the new sleeping point introduced by the fact the PF_LOCK() is a
rwlock?  Did you consider using a mutex, at least for the time being,
in order to not run in such issues?



Re: httpd with rdomain

2021-05-17 Thread Theo de Raadt
Claudio Jeker  wrote:

> I don't think httpd should handle rdomain in its config but instead you
> should run multiple httpd processes one for each rdomain.
> Not every daemon needs to support running in multiple rdomains at once.

I agree with you Claudio.

This in particular is what lost me:

> > -   if (pledge("stdio rpath wpath cpath inet dns sendfd", NULL) == -1)
> > +   if (pledge("stdio rpath wpath cpath inet dns sendfd wroute", NULL) == 
> > -1)

We do not want to give httpd additional permissions.





Re: httpd with rdomain

2021-05-17 Thread Peter J. Philipp
On Mon, May 17, 2021 at 05:55:00PM +0200, Claudio Jeker wrote:
[..]
> > Granted I did not test it with a difficult config.  And I advise any 
> > committer
> > to test this fully before trusting my code.  For me it's better than using
> > route and starting httpd twice, though.
> > 
> 
> I don't think httpd should handle rdomain in its config but instead you
> should run multiple httpd processes one for each rdomain.
> Not every daemon needs to support running in multiple rdomains at once.

OK, no worries, I did some work on this and did a simple search for someone
having done a patch like this, and there was no hits.  So the next person
at least finds this and knows. :-) And if in future you guys change your
mind you at least know one way of how it was handled by someone.

Thanks Claudio.

Cheers!
-peter



Re: httpd with rdomain

2021-05-17 Thread Claudio Jeker
On Mon, May 17, 2021 at 05:11:41PM +0200, Peter J. Philipp wrote:
> Hi,
> 
> I found myself wanting this, this morning.  I made a patch but then I put it
> in the wrong spot, and noticed it needed rewriting of SERVER in parse.y.
> Later in the day I found myself looking into this, and a better patch came out
> of it.  It works on a simple setup for me (mind the censored IPv6 address):
> 
> simple config>
> 
> prefork 2
> 
> server "default" {
> log style combined
> listen on 159.69.32.73 port 80
> listen on 2a01:4f8:1c1c:89f1::1 port 80
> 
> root "/htdocs"
> location "/" {
> directory index index.html
> }
> location "/.well-known/acme-challenge/*" {
>root "/acme"
>request strip 2
> }
> }
> 
> server "default" rdomain 4 {
> log style combined
>   listen on Censored port 80
> 
> root "/htdocs"
> location "/" {
> directory index index.html
> }
> 
> location "/.well-known/acme-challenge/*" {
>root "/acme"
>request strip 2
> }
> }
> <--
> 
> Granted I did not test it with a difficult config.  And I advise any committer
> to test this fully before trusting my code.  For me it's better than using
> route and starting httpd twice, though.
> 

I don't think httpd should handle rdomain in its config but instead you
should run multiple httpd processes one for each rdomain.
Not every daemon needs to support running in multiple rdomains at once.

> Patch after my signature.
> 
> Best Regards,
> -peter
> 
> Index: httpd.c
> ===
> RCS file: /cvs/src/usr.sbin/httpd/httpd.c,v
> retrieving revision 1.71
> diff -u -p -r1.71 httpd.c
> --- httpd.c   27 Jan 2021 07:21:52 -  1.71
> +++ httpd.c   17 May 2021 14:56:54 -
> @@ -226,7 +226,7 @@ main(int argc, char *argv[])
>   if (ps->ps_noaction == 0)
>   log_info("startup");
>  
> - if (pledge("stdio rpath wpath cpath inet dns sendfd", NULL) == -1)
> + if (pledge("stdio rpath wpath cpath inet dns sendfd wroute", NULL) == 
> -1)
>   fatal("pledge");
>  
>   event_init();
> Index: httpd.conf.5
> ===
> RCS file: /cvs/src/usr.sbin/httpd/httpd.conf.5,v
> retrieving revision 1.115
> diff -u -p -r1.115 httpd.conf.5
> --- httpd.conf.5  6 Apr 2021 06:28:38 -   1.115
> +++ httpd.conf.5  17 May 2021 14:56:54 -
> @@ -143,13 +143,13 @@ Each
>  section starts with a declaration of the server
>  .Ar name :
>  .Bl -tag -width Ds
> -.It Ic server Ar name Brq ...
> +.It Ic server Ar name [ Ic rdomain Ar num ] Brq ...
>  Match the server name using shell globbing rules.
>  This can be an explicit name,
>  .Ar www.example.com ,
>  or a name including wildcards,
>  .Ar *.example.com .
> -.It Ic server match Ar name Brq ...
> +.It Ic server match Ar name  [ Ic rdomain Ar num ] Brq ...
>  Match the server name using pattern matching,
>  see
>  .Xr patterns 7 .
> Index: httpd.h
> ===
> RCS file: /cvs/src/usr.sbin/httpd/httpd.h,v
> retrieving revision 1.155
> diff -u -p -r1.155 httpd.h
> --- httpd.h   10 Apr 2021 10:10:07 -  1.155
> +++ httpd.h   17 May 2021 14:56:54 -
> @@ -487,6 +487,7 @@ struct server_config {
>  
>   in_port_tport;
>   struct sockaddr_storage  ss;
> + u_intrdomain;
>   int  prefixlen;
>   struct timeval   timeout;
>   struct timeval   requesttimeout;
> Index: parse.y
> ===
> RCS file: /cvs/src/usr.sbin/httpd/parse.y,v
> retrieving revision 1.125
> diff -u -p -r1.125 parse.y
> --- parse.y   10 Apr 2021 10:10:07 -  1.125
> +++ parse.y   17 May 2021 14:56:54 -
> @@ -120,6 +120,8 @@ intgetservice(char *);
>  int   is_if_in_group(const char *, const char *);
>  int   get_fastcgi_dest(struct server_config *, const char *, char *);
>  void  remove_locations(struct server_config *);
> +struct server*serverfill(int, char *, u_int);
> +int   serveropts_fill(struct server *, int);
>  
>  typedef struct {
>   union {
> @@ -139,7 +141,7 @@ typedef struct {
>  %token   LISTEN LOCATION LOG LOGDIR MATCH MAXIMUM NO NODELAY OCSP ON 
> PORT PREFORK
>  %token   PROTOCOLS REQUESTS ROOT SACK SERVER SOCKET STRIP STYLE SYSLOG 
> TCP TICKET
>  %token   TIMEOUT TLS TYPE TYPES HSTS MAXAGE SUBDOMAINS DEFAULT PRELOAD 
> REQUEST
> -%token   ERROR INCLUDE AUTHENTICATE WITH BLOCK DROP RETURN PASS REWRITE
> +%token   ERROR INCLUDE AUTHENTICATE WITH BLOCK DROP RETURN PASS REWRITE 
> RDOMAIN
>  %token   CA CLIENT CRL OPTIONAL PARAM FORWARDED 

httpd with rdomain

2021-05-17 Thread Peter J. Philipp
Hi,

I found myself wanting this, this morning.  I made a patch but then I put it
in the wrong spot, and noticed it needed rewriting of SERVER in parse.y.
Later in the day I found myself looking into this, and a better patch came out
of it.  It works on a simple setup for me (mind the censored IPv6 address):

simple config>

prefork 2

server "default" {
log style combined
listen on 159.69.32.73 port 80
listen on 2a01:4f8:1c1c:89f1::1 port 80

root "/htdocs"
location "/" {
directory index index.html
}
location "/.well-known/acme-challenge/*" {
   root "/acme"
   request strip 2
}
}

server "default" rdomain 4 {
log style combined
listen on Censored port 80

root "/htdocs"
location "/" {
directory index index.html
}

location "/.well-known/acme-challenge/*" {
   root "/acme"
   request strip 2
}
}
<--

Granted I did not test it with a difficult config.  And I advise any committer
to test this fully before trusting my code.  For me it's better than using
route and starting httpd twice, though.

Patch after my signature.

Best Regards,
-peter

Index: httpd.c
===
RCS file: /cvs/src/usr.sbin/httpd/httpd.c,v
retrieving revision 1.71
diff -u -p -r1.71 httpd.c
--- httpd.c 27 Jan 2021 07:21:52 -  1.71
+++ httpd.c 17 May 2021 14:56:54 -
@@ -226,7 +226,7 @@ main(int argc, char *argv[])
if (ps->ps_noaction == 0)
log_info("startup");
 
-   if (pledge("stdio rpath wpath cpath inet dns sendfd", NULL) == -1)
+   if (pledge("stdio rpath wpath cpath inet dns sendfd wroute", NULL) == 
-1)
fatal("pledge");
 
event_init();
Index: httpd.conf.5
===
RCS file: /cvs/src/usr.sbin/httpd/httpd.conf.5,v
retrieving revision 1.115
diff -u -p -r1.115 httpd.conf.5
--- httpd.conf.56 Apr 2021 06:28:38 -   1.115
+++ httpd.conf.517 May 2021 14:56:54 -
@@ -143,13 +143,13 @@ Each
 section starts with a declaration of the server
 .Ar name :
 .Bl -tag -width Ds
-.It Ic server Ar name Brq ...
+.It Ic server Ar name [ Ic rdomain Ar num ] Brq ...
 Match the server name using shell globbing rules.
 This can be an explicit name,
 .Ar www.example.com ,
 or a name including wildcards,
 .Ar *.example.com .
-.It Ic server match Ar name Brq ...
+.It Ic server match Ar name  [ Ic rdomain Ar num ] Brq ...
 Match the server name using pattern matching,
 see
 .Xr patterns 7 .
Index: httpd.h
===
RCS file: /cvs/src/usr.sbin/httpd/httpd.h,v
retrieving revision 1.155
diff -u -p -r1.155 httpd.h
--- httpd.h 10 Apr 2021 10:10:07 -  1.155
+++ httpd.h 17 May 2021 14:56:54 -
@@ -487,6 +487,7 @@ struct server_config {
 
in_port_tport;
struct sockaddr_storage  ss;
+   u_intrdomain;
int  prefixlen;
struct timeval   timeout;
struct timeval   requesttimeout;
Index: parse.y
===
RCS file: /cvs/src/usr.sbin/httpd/parse.y,v
retrieving revision 1.125
diff -u -p -r1.125 parse.y
--- parse.y 10 Apr 2021 10:10:07 -  1.125
+++ parse.y 17 May 2021 14:56:54 -
@@ -120,6 +120,8 @@ int  getservice(char *);
 int is_if_in_group(const char *, const char *);
 int get_fastcgi_dest(struct server_config *, const char *, char *);
 voidremove_locations(struct server_config *);
+struct server  *serverfill(int, char *, u_int);
+int serveropts_fill(struct server *, int);
 
 typedef struct {
union {
@@ -139,7 +141,7 @@ typedef struct {
 %token LISTEN LOCATION LOG LOGDIR MATCH MAXIMUM NO NODELAY OCSP ON PORT PREFORK
 %token PROTOCOLS REQUESTS ROOT SACK SERVER SOCKET STRIP STYLE SYSLOG TCP TICKET
 %token TIMEOUT TLS TYPE TYPES HSTS MAXAGE SUBDOMAINS DEFAULT PRELOAD REQUEST
-%token ERROR INCLUDE AUTHENTICATE WITH BLOCK DROP RETURN PASS REWRITE
+%token ERROR INCLUDE AUTHENTICATE WITH BLOCK DROP RETURN PASS REWRITE RDOMAIN
 %token CA CLIENT CRL OPTIONAL PARAM FORWARDED FOUND NOT
 %token   STRING
 %token   NUMBER
@@ -222,217 +224,42 @@ main : PREFORK NUMBER{
;
 
 server : SERVER optmatch STRING{
-   struct server   *s;
-   struct sockaddr_un  *sun;
-
if (!loadcfg) {
free($3);
YYACCEPT;
}
-
-   if ((s = calloc(1, sizeof (*s))) == NULL)
-  

Re: reposync error

2021-05-17 Thread Hrvoje Popovski
On 17.5.2021. 13:55, Theo Buehler wrote:
> On Mon, May 17, 2021 at 11:24:25AM +0200, Hrvoje Popovski wrote:
>> Hi all,
>>
>> today after sysupgrade i' getting this error with reposync
>>
>> r620-1# su -m cvs -c "reposync -s src rsync://ftp.hostserver.de/cvs
>> /home/cvs"
>> reposync: rsync error: rsync: did not see server greeting
>> rsync error: error starting client-server protocol (code 5) at
>> main.c(1814) [Receiver=3.2.3]
>>
>>
>> i tried reposync on a few hosts and error is the same ..
>>
> This is a bug introduced in ssh.c -r1.554. Fixed in ssh.c -r1.556:


tnx .. .




parallel forwarding vs. bridges

2021-05-17 Thread Alexandr Nedvedicky
Hrvoje,

managed to trigger diagnostic panics with diff [1] sent by bluhm@
some time ago. The panic Hrvoje sees comes from ether_input() here:

 414 
 415 /*
 416  * Third phase: bridge processing.
 417  *
 418  * Give the packet to a bridge interface, ie, bridge(4),
 419  * switch(4), or tpmr(4), if it is configured. A bridge
 420  * may take the packet and forward it to another port, or it
 421  * may return it here to ether_input() to support local
 422  * delivery to this port.
 423  */
 424 
 425 ac = (struct arpcom *)ifp;
 426 
 427 smr_read_enter();
 428 eb = SMR_PTR_GET(>ac_brport);
 429 if (eb != NULL) {
 430 m = (*eb->eb_input)(ifp, m, dst, eb->eb_port);
 431 if (m == NULL) {
 432 smr_read_leave();
 433 return;
 434 }
 435 }
 436 smr_read_leave();
 437 

in current tree the ether_input() is protected by NET_LOCK(), which is grabbed
by caller as a writer. bluhm's diff changes NET_LOCK() readlock, so
ether_input() can run concurrently. Switching NET_LOCK() to r-lock has
implications on smr read section above. The ting is the call to eb->eb_input()
can sleep now. This is something what needs to be avoided within smr section.

diff below introduces a reference counting to bridge port using atomic ops.
the section above is changed to:

 415 /*
 416  * Third phase: bridge processing.
 417  *
 418  * Give the packet to a bridge interface, ie, bridge(4),
 419  * switch(4), or tpmr(4), if it is configured. A bridge
 420  * may take the packet and forward it to another port, or it
 421  * may return it here to ether_input() to support local
 422  * delivery to this port.
 423  */
 424 
 425 ac = (struct arpcom *)ifp;
 426 
 427 smr_read_enter();
 428 eb = SMR_PTR_GET(>ac_brport);
 429 if (eb != NULL)
 430 eb->eb_port_take(eb->eb_port);
 431 smr_read_leave();
 432 if (eb != NULL) {
 433 m = (*eb->eb_input)(ifp, m, dst, eb->eb_port);
 434 eb->eb_port_rele(eb->eb_port);
 435 if (m == NULL) {
 436 return;
 437 }
 438 }
 439 

each bridge module must provide its own implementation for
eb_port_take()/eb_port_rele(). Not sure if it is a way to go. According to
testing done by Hrvoje the diagnostic panic disappears. So we can say diff
solves the problem, but there might be better ways to solve it.

there are other bugs to kill as pointed out here [2].

thanks and
regards
sashan

[1] https://marc.info/?l=openbsd-tech=161903387904923=2

[2] https://marc.info/?l=openbsd-tech=162099270106067=2

8<---8<---8<--8<

diff --git a/sys/net/if_bridge.c b/sys/net/if_bridge.c
index 7dbcf4a2ab0..2afcbb296a0 100644
--- a/sys/net/if_bridge.c
+++ b/sys/net/if_bridge.c
@@ -138,6 +138,8 @@ int bridge_ipsec(struct ifnet *, struct ether_header *, 
int, struct llc *,
 #endif
 int bridge_clone_create(struct if_clone *, int);
 intbridge_clone_destroy(struct ifnet *);
+void   bridge_take(void *);
+void   bridge_rele(void *);
 
 #defineETHERADDR_IS_IP_MCAST(a) \
/* struct etheraddr *a; */  \
@@ -152,6 +154,8 @@ struct if_clone bridge_cloner =
 
 const struct ether_brport bridge_brport = {
bridge_input,
+   bridge_take,
+   bridge_rele,
NULL,
 };
 
@@ -2049,3 +2053,15 @@ bridge_send_icmp_err(struct ifnet *ifp,
  dropit:
m_freem(n);
 }
+
+void
+bridge_take(void *unused)
+{
+   return;
+}
+
+void
+bridge_rele(void *unused)
+{
+   return;
+}
diff --git a/sys/net/if_ethersubr.c b/sys/net/if_ethersubr.c
index c9493b7f857..4b975e2d4a1 100644
--- a/sys/net/if_ethersubr.c
+++ b/sys/net/if_ethersubr.c
@@ -426,14 +426,16 @@ ether_input(struct ifnet *ifp, struct mbuf *m)
 
smr_read_enter();
eb = SMR_PTR_GET(>ac_brport);
+   if (eb != NULL)
+   eb->eb_port_take(eb->eb_port);
+   smr_read_leave();
if (eb != NULL) {
m = (*eb->eb_input)(ifp, m, dst, eb->eb_port);
+   eb->eb_port_rele(eb->eb_port);
if (m == NULL) {
-   smr_read_leave();
return;
}
}
-   smr_read_leave();
 
/*
 * Fourth phase: drop service delimited packets.
diff --git a/sys/net/if_switch.c b/sys/net/if_switch.c
index 22aecdc6b75..ca24597d1f7 100644
--- a/sys/net/if_switch.c
+++ b/sys/net/if_switch.c
@@ -110,6 +110,8 @@ struct mbuf
 voidswitch_flow_classifier_dump(struct switch_softc *,
struct switch_flow_classify *);
 voidswitchattach(int);
+void   switch_take(void *);
+void   switch_rele(void *);
 
 struct 

Re: reposync error

2021-05-17 Thread Theo Buehler
On Mon, May 17, 2021 at 11:24:25AM +0200, Hrvoje Popovski wrote:
> Hi all,
> 
> today after sysupgrade i' getting this error with reposync
> 
> r620-1# su -m cvs -c "reposync -s src rsync://ftp.hostserver.de/cvs
> /home/cvs"
> reposync: rsync error: rsync: did not see server greeting
> rsync error: error starting client-server protocol (code 5) at
> main.c(1814) [Receiver=3.2.3]
> 
> 
> i tried reposync on a few hosts and error is the same ..
> 

This is a bug introduced in ssh.c -r1.554. Fixed in ssh.c -r1.556:

Index: ssh.c
===
RCS file: /cvs/src/usr.bin/ssh/ssh.c,v
retrieving revision 1.555
retrieving revision 1.556
diff -u -p -r1.555 -r1.556
--- ssh.c   14 May 2021 05:20:32 -  1.555
+++ ssh.c   17 May 2021 11:43:16 -  1.556
@@ -1,4 +1,4 @@
-/* $OpenBSD: ssh.c,v 1.555 2021/05/14 05:20:32 djm Exp $ */
+/* $OpenBSD: ssh.c,v 1.556 2021/05/17 11:43:16 djm Exp $ */
 /*
  * Author: Tatu Ylonen 
  * Copyright (c) 1995 Tatu Ylonen , Espoo, Finland
@@ -2126,7 +2126,8 @@ ssh_session2(struct ssh *ssh, const stru
stdin_null_flag = 1;
no_shell_flag = 1;
tty_flag = 0;
-   if (!fork_after_authentication_flag && !ono_shell_flag)
+   if (!fork_after_authentication_flag &&
+   (!ono_shell_flag || options.stdio_forward_host != NULL))
need_controlpersist_detach = 1;
fork_after_authentication_flag = 1;
}



Re: services(5): more cleanup

2021-05-17 Thread Craig Skinner
On Sun, 16 May 2021 01:11:51 + Aisha Tammy wrote:
> I have a few machines which use something to the effect of `pass in on
> egress proto tcp to port smtps ...`.
> A quick question, does this mean that the port in pf.conf will also
> have to be renamed?

From experience doing the same Aisha, # pfctl -nf /etc/pf.conf will
complain if there are unknown port or host names.

These can be checked with getent(1)

As pf starts before unbound when booting, any hostnames used by pf
need to be in /etc/hosts & kept current.


Cheers,
Craig.



uao_dropswap_range()

2021-05-17 Thread Martin Pieuchot
Diff below makes use of uao_dropswap_range() in uao_free() instead of
duplicating it.  This function has been imported from NetBSD along with
TMPFS.  I'd like to use it to reduce the difference with their tree and
reduce the size of my upcoming `vmobjlock' diff.

ok?

Index: uvm/uvm_aobj.c
===
RCS file: /cvs/src/sys/uvm/uvm_aobj.c,v
retrieving revision 1.95
diff -u -p -r1.95 uvm_aobj.c
--- uvm/uvm_aobj.c  22 Apr 2021 11:54:32 -  1.95
+++ uvm/uvm_aobj.c  11 May 2021 11:26:15 -
@@ -351,58 +351,16 @@ uao_set_swslot(struct uvm_object *uobj, 
 static void
 uao_free(struct uvm_aobj *aobj)
 {
+   struct uvm_object *uobj = >u_obj;
 
-   if (UAO_USES_SWHASH(aobj)) {
-   int i, hashbuckets = aobj->u_swhashmask + 1;
+   uao_dropswap_range(uobj, 0, 0);
 
+   if (UAO_USES_SWHASH(aobj)) {
/*
-* free the swslots from each hash bucket,
-* then the hash bucket, and finally the hash table itself.
+* free the hash table itself.
 */
-   for (i = 0; i < hashbuckets; i++) {
-   struct uao_swhash_elt *elt, *next;
-
-   for (elt = LIST_FIRST(>u_swhash[i]);
-elt != NULL;
-elt = next) {
-   int j;
-
-   for (j = 0; j < UAO_SWHASH_CLUSTER_SIZE; j++) {
-   int slot = elt->slots[j];
-
-   if (slot == 0) {
-   continue;
-   }
-   uvm_swap_free(slot, 1);
-   /*
-* this page is no longer
-* only in swap.
-*/
-   atomic_dec_int();
-   }
-
-   next = LIST_NEXT(elt, list);
-   pool_put(_swhash_elt_pool, elt);
-   }
-   }
-
hashfree(aobj->u_swhash, UAO_SWHASH_BUCKETS(aobj->u_pages), 
M_UVMAOBJ);
} else {
-   int i;
-
-   /*
-* free the array
-*/
-   for (i = 0; i < aobj->u_pages; i++) {
-   int slot = aobj->u_swslots[i];
-
-   if (slot) {
-   uvm_swap_free(slot, 1);
-
-   /* this page is no longer only in swap. */
-   atomic_dec_int();
-   }
-   }
free(aobj->u_swslots, M_UVMAOBJ, aobj->u_pages * sizeof(int));
}
 
@@ -1487,9 +1445,6 @@ uao_pagein_page(struct uvm_aobj *aobj, i
 }
 
 /*
- * XXX pedro: Once we are comfortable enough with this function, we can adapt
- * uao_free() to use it.
- *
  * uao_dropswap_range: drop swapslots in the range.
  *
  * => aobj must be locked and is returned locked.



Re: running network stack forwarding in parallel

2021-05-17 Thread Martin Pieuchot
On 16/05/21(Sun) 15:56, Vitaliy Makkoveev wrote:
> 
> 
> > On 14 May 2021, at 14:43, Martin Pieuchot  wrote:
> > 
> > On 13/05/21(Thu) 14:50, Vitaliy Makkoveev wrote:
> >> On Thu, May 13, 2021 at 01:15:05PM +0200, Hrvoje Popovski wrote:
> >>> On 13.5.2021. 1:25, Vitaliy Makkoveev wrote:
>  It seems this lock order issue is not parallel diff specific.
> >>> 
> >>> 
> >>> 
> >>> Yes,  you are right ... it seemed familiar but i couldn't reproduce it
> >>> on lapc trunk or without this diff so i thought that parallel diff is
> >>> one to blame ..
> >>> 
> >>> 
> >>> sorry for noise ..
> >>> 
> >> 
> >> Timeout thread and interface destroy thread are both serialized by
> >> kernel lock so it's hard to catch this issue. So your report is
> >> useful :)
> > 
> > The use of the NET_LOCK() in *clone_destroy() is problematic.  tpmr(4)
> > has a similar problem as reported by Hrvoje in a different thread.  I
> > don't know what it is serializing, hopefully David can tell us more.
> > 
> 
> It serializes detach hook and clone_detach. Detach hooks are executed
> with netlock held. Unfortunately this problem is much complicated,
> and we can’t just introduce new lock to solve it because this will
> introduce lock order issue.

We're talking about different uses of the NET_LOCK().  if_detach() and
if_deactivate() internally grab the NET_LOCK() for the reason you
mentioned.

I'm asking what in aggr_down() and aggr_p_dtor() or respectively
tpmr_down() and tpmr_p_dtor() require the NET_LOCK() and if this could
be done differently.



reposync error

2021-05-17 Thread Hrvoje Popovski
Hi all,

today after sysupgrade i' getting this error with reposync

r620-1# su -m cvs -c "reposync -s src rsync://ftp.hostserver.de/cvs
/home/cvs"
reposync: rsync error: rsync: did not see server greeting
rsync error: error starting client-server protocol (code 5) at
main.c(1814) [Receiver=3.2.3]


i tried reposync on a few hosts and error is the same ..