Re: CVS commit: src/lib/libcurses

2022-11-30 Thread Ryo ONODERA
Hi,

r1.125 of refresh.c breaks /usr/bin/vi for me.
When scrolling down long file with 'j' key and reaches the bottom
line, screen is not scrolled up and new lines are displayed
over the previous bottom line.

Could you take a look at my problem?

Thank you very much.

"Brett Lymn"  writes:

> Module Name:  src
> Committed By: blymn
> Date: Wed Nov 30 06:19:16 UTC 2022
>
> Modified Files:
>   src/lib/libcurses: refresh.c
>
> Log Message:
> When performing a scroll, check the last line of the region is the same
> on virtscr and curscr because the indexes past are supposed to be
> one *past* the last matching line (they may actually match if the line is
> at the bottom of the screen).  Iff they don't match reduce the scroll
> region size by one so we don't scroll non-matching lines, also check
> if the region is then 0 after the decrement and just return if it was.
>
>
> To generate a diff of this commit:
> cvs rdiff -u -r1.124 -r1.125 src/lib/libcurses/refresh.c
>
> Please note that diffs are not public domain; they are subject to the
> copyright notices on the relevant files.
>
> Modified files:
>
> Index: src/lib/libcurses/refresh.c
> diff -u src/lib/libcurses/refresh.c:1.124 src/lib/libcurses/refresh.c:1.125
> --- src/lib/libcurses/refresh.c:1.124 Wed Oct 19 06:09:27 2022
> +++ src/lib/libcurses/refresh.c   Wed Nov 30 06:19:15 2022
> @@ -1,4 +1,4 @@
> -/*   $NetBSD: refresh.c,v 1.124 2022/10/19 06:09:27 blymn Exp $  */
> +/*   $NetBSD: refresh.c,v 1.125 2022/11/30 06:19:15 blymn Exp $  */
>  
>  /*
>   * Copyright (c) 1981, 1993, 1994
> @@ -34,7 +34,7 @@
>  #if 0
>  static char sccsid[] = "@(#)refresh.c8.7 (Berkeley) 8/13/94";
>  #else
> -__RCSID("$NetBSD: refresh.c,v 1.124 2022/10/19 06:09:27 blymn Exp $");
> +__RCSID("$NetBSD: refresh.c,v 1.125 2022/11/30 06:19:15 blymn Exp $");
>  #endif
>  #endif   /* not lint */
>  
> @@ -1852,6 +1852,13 @@ scrolln(int starts, int startw, int curs
>   ox = curscr->curx;
>   n = starts - startw;
>  
> + if (!lineeq(__virtscr->alines[startw]->line,
> + curscr->alines[starts]->line, (size_t) __virtscr->maxx))
> + n--;
> +
> + if (n == 0)
> + return;
> +
>   /*
>* XXX
>* The initial tests that set __noqch don't let us reach here unless
>

-- 
Ryo ONODERA // r...@tetera.org
PGP fingerprint = 82A2 DC91 76E0 A10A 8ABB  FD1B F404 27FA C7D1 15F3


Re: CVS commit: src/etc

2022-11-28 Thread Christos Zoulas
In article <20221128024658.71caaf...@cvs.netbsd.org>,
Jan Schaumann  wrote:
>-=-=-=-=-=-
>
>Module Name:   src
>Committed By:  jschauma
>Date:  Mon Nov 28 02:46:58 UTC 2022
>
>Modified Files:
>   src/etc: protocols
>
>Log Message:
>regen from IANA 2022-09-28
>
>
>To generate a diff of this commit:
>cvs rdiff -u -r1.31 -r1.32 src/etc/protocols
>
>Please note that diffs are not public domain; they are subject to the
>copyright notices on the relevant files.
>
>
>-=-=-=-=-=-
>
>Modified files:
>
>Index: src/etc/protocols
>diff -u src/etc/protocols:1.31 src/etc/protocols:1.32
>--- src/etc/protocols:1.31 Thu Apr  8 19:03:43 2021
>+++ src/etc/protocols  Mon Nov 28 02:46:58 2022
>@@ -1,8 +1,10 @@
>-# $NetBSD: protocols,v 1.31 2021/04/08 19:03:43 christos Exp $
>-# See also: protocols(5), http://www.sethwklein.net/projects/iana-etc/
>+# $NetBSD: protocols,v 1.32 2022/11/28 02:46:58 jschauma Exp $
>+# See also: protocols(5), https://www.iana.org/assignments/protocol-numbers/
> #
>+#   
>Protocol Numbers
>+# 
> #Last Updated
>-#2021-02-26
>+#2022-09-28
> # 
> #Available Formats
> #[IMG]
>@@ -50,7 +52,7 @@ igmp   2 IGMP # Internet
> ggp3 GGP  # Gateway-to-Gateway 
>[RFC823]
> ipv4   4 IPv4 # IPv4 encapsulation 
>[RFC2003]
> st 5 ST   # Stream 
>[RFC1190][RFC1819]
>-tcp6 TCP  # Transmission Control   
>[RFC793]
>+tcp6 TCP  # Transmission Control   
>[RFC9293]
> cbt7 CBT  # CBT
>[Tony_Ballardie]
> egp8 EGP  # Exterior Gateway Protocol  
>[RFC888][David_Mills]
> #   any private interior gateway
>@@ -162,9 +164,9 @@ iso-ip80 ISO-IP   # ISO Inte
> vmtp  81 VMTP # VMTP   
>[Dave_Cheriton]
> secure-vmtp   82 SECURE-VMTP  # SECURE-VMTP
>[Dave_Cheriton]
> vines 83 VINES# VINES  
>[Brian Horn]
>-ttp   84 TTP iptm IPTM # Transaction Transport 
>[Jim_Stevens]
>+ttp   84 TTP  # Transaction Transport  
>[Jim_Stevens]
> #   Protocol
>-#iptm  84 IPTM # Internet Protocol Traffic 
>[Jim_Stevens]
>+iptm  84 IPTM # Internet Protocol Traffic  

This will probably make the test fail. It was manually handled before
[the alias to the same number]

>[Jim_Stevens]
> #   Manager
> nsfnet-igp85 NSFNET-IGP   # NSFNET-IGP 
>[Hans_Werner_Braun]
> dgp   86 DGP  # Dissimilar Gateway Protocol
>[M/A-COM Government Systems, "Dissimilar Gateway Protocol
>Specification,
>@@ -205,7 +207,7 @@ ipcomp   108 IPComp   # IP Paylo
> snp  109 SNP  # Sitara Networks Protocol   
>[Manickam_R_Sridhar]
> compaq-peer  110 Compaq-Peer  # Compaq Peer Protocol   
>[Victor_Volpe]
> ipx-in-ip111 IPX-in-IP# IPX in IP  
>[CJ_Lee]
>-vrrp 112 VRRP carp# Virtual Router Redundancy  
>[RFC5798]
>+vrrp 112 VRRP # Virtual Router Redundancy  
>[RFC5798]
> #   Protocol
> pgm  113 PGM  # PGM Reliable Transport 
>[Tony_Speakman]
> #   Protocol
>@@ -238,16 +240,16 @@ rsvp-e2e-ignore 134 RSVP-E2E-IGNORE # [R
> mobility 135 Mobility # Header 
>Y[RFC6275]
> udplite  136 UDPLite  # [RFC3828]
> mpls-in-ip   137 MPLS-in-IP   # [RFC4023]
>-manet138 MANET# MANET Protocols
>[RFC5498]
>+manet138 manet# MANET Protocols

This will also probably make the test fail (alias to the same name)

>[RFC5498]
> hip  139 HIP  # Host Identity Protocol Y   
>[RFC7401]
> shim6140 Shim6# Shim6 Protocol Y   
>[RFC5533]
> wesp 141 WESP # Wrapped Encapsulating  
>[RFC5840]
> #   Security Payload
> rohc 142 ROHC # Robust Header Compression  
>[RFC5858]
> ethernet 143 Ethernet # Ethernet   
>[RFC8986]
>-#144-252Unassigned 
>[Internet_Assigned_Numbers_Authority]
>-pfsync   240 PFSYNC   # PF Synchronization

I would put this back, it is unofficial, but used.

christos



Re: CVS commit: src/sys/dev/i2c

2022-11-24 Thread Brad Spencer
Taylor R Campbell  writes:

[snip]

> The issue here isn't the duration of the delay -- just the mechanism.
> Sleeping for 35ms with kpause(9) is fine, but busy-waiting on the CPU
> for 35ms with delay(9) is not (unless HZ is set to something absurdly
> low like 10 instead of the usual 100, but I doubt that's an important
> use-case to consider here).
>
> In this case, you should use:
>
>   kpause("bmx280", /*intr*/false, MAX(1, mstohz(35)), NULL);

All understood..  the above, however, was not quite right either.  That
delay is related to the over sampling that is set and wasn't a fixed
value.  I made adjustments to scale the amount of time actually needed
to wait with the over sampling setting.  This is the proper thing that
needed to be done in this case.  I do agree that kpause is a better
thing to use here too..  no doubt..

There wasn't any real documentation about this, but it clearly was
something that needed to be considered.  The only reference was a single
table that listed some of the typical and max durations for just *SOME*
of the over sampling setting (worse, yet, this table was removed from
later versions of the docs).  It left out a lot and did not include any
sort of formula.  Though some experimentation I can up with one that
should be workable and allowed for some tuning if it wasn't quite right.

>> was, as one might say, a "surprising development" as the documentation
>> really does not hint that this sort of thing goes on (or was even
>> possible to do).
>
> Can you put a link to the documentation in the source code, and cite
> the sections where you get the complicated formulas like in
> bmx280_compensate_P_int64?

Ya, I can do that.  The three compensation formulas are pulled mostly
literally from the docs.  All I really did was changed the types to
match the kernel names and unglobal'ed some of the variables for the
factory calibration settings.  Peeking at Adafruit's Ardunio code, they
did the same thing.






-- 
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org


Re: CVS commit: src/sys/dev/i2c

2022-11-24 Thread Taylor R Campbell
> Date: Wed, 23 Nov 2022 01:42:13 -0500
> From: Brad Spencer 
> 
> Simon Burge  writes:
> 
> > +   delay(35000);
> >
> > This will spin for 35 milliseconds (per sensor read if I read this
> > correctly).  Can you please look at using kpause(9) for this delay?
> > See some other i2c drivers for examples of this.
> 
> Probably possible, may be a couple of days before I can get to
> it (or not, depends on how holiday prep goes)...
> 
> I have used some of the cv_timedwait stuff in the past and didn't know
> about kpause which seems to be suited to what I need to do.  Almost all
> sensor chips require some sort of wait after a command is sent, but this
> one was particularly frustrating about it.

The issue here isn't the duration of the delay -- just the mechanism.
Sleeping for 35ms with kpause(9) is fine, but busy-waiting on the CPU
for 35ms with delay(9) is not (unless HZ is set to something absurdly
low like 10 instead of the usual 100, but I doubt that's an important
use-case to consider here).

In this case, you should use:

kpause("bmx280", /*intr*/false, MAX(1, mstohz(35)), NULL);

> This
> was, as one might say, a "surprising development" as the documentation
> really does not hint that this sort of thing goes on (or was even
> possible to do).

Can you put a link to the documentation in the source code, and cite
the sections where you get the complicated formulas like in
bmx280_compensate_P_int64?


Re: CVS commit: src/sys/dev/i2c

2022-11-22 Thread Brad Spencer
Simon Burge  writes:

> Hi Brad,
>
>> Module Name: src
>> Committed By:brad
>> Date:Tue Nov 22 19:40:31 UTC 2022
>>
>> Modified Files:
>>
>>  src/sys/dev/i2c: bmx280.c
>>
>> Log Message:
>>
>> Read the datasheet more closely and put in some delays.  The chip will
>> just return junk if the wait is not long enough to allow a measurement
>> to start.
>
>
> +   /* Hmm... this delay is not documented well..  mostly just a guess...
> +* If it is too short, you will get junk returned as it is possible 
> to try
> +* to ask for the data before the chip has even started... it seems...
> +*/
> +   delay(35000);
>
> This will spin for 35 milliseconds (per sensor read if I read this
> correctly).  Can you please look at using kpause(9) for this delay?
> See some other i2c drivers for examples of this.
>
> Cheers,
> Simon.

Probably possible, may be a couple of days before I can get to
it (or not, depends on how holiday prep goes)...

I have used some of the cv_timedwait stuff in the past and didn't know
about kpause which seems to be suited to what I need to do.  Almost all
sensor chips require some sort of wait after a command is sent, but this
one was particularly frustrating about it.

To give out too much information, all of the other sensors I have worked
with NACK the I2C bus during the measurement cycle.  This one, and
probably all of Bosch's chips, do not do that.  You have to read back a
status register and check to see if it is busy doing something or other.
If you just go ahead and read the data registers when it is busy you get
junk with no hint that it is junk.  But a big "however" is that it is
possible (even for a RPI) to read the status register BEFORE the chip
has even started to do its measurements, get a positive response
(i.e. not busy) and then read the data registers and get junk.  This
was, as one might say, a "surprising development" as the documentation
really does not hint that this sort of thing goes on (or was even
possible to do).




-- 
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org


Re: CVS commit: src/sys/dev/i2c

2022-11-22 Thread Simon Burge
Hi Brad,

> Module Name:  src
> Committed By: brad
> Date: Tue Nov 22 19:40:31 UTC 2022
>
> Modified Files:
>
>   src/sys/dev/i2c: bmx280.c
>
> Log Message:
>
> Read the datasheet more closely and put in some delays.  The chip will
> just return junk if the wait is not long enough to allow a measurement
> to start.


+   /* Hmm... this delay is not documented well..  mostly just a guess...
+* If it is too short, you will get junk returned as it is possible to 
try
+* to ask for the data before the chip has even started... it seems...
+*/
+   delay(35000);

This will spin for 35 milliseconds (per sensor read if I read this
correctly).  Can you please look at using kpause(9) for this delay?
See some other i2c drivers for examples of this.

Cheers,
Simon.


Re: CVS commit: src/sys

2022-11-15 Thread Roy Marples

On 15/11/2022 05:35, Ryo Shimizu wrote:

Since l2_sha continues to point outside of m_data manipulated by m_adj(),
it can be corrupted by subsequent m_pullup() or other mbuf m_*() operations.
I still believe that using MTAG is appropriate.

How about something like this? (not tested)


Wow, that looks good and works well.
I've comitted it, thanks.


Since l2_shalen is fixed to ETHER_ADDR_LEN for now, the tag name should be
PACKET_TAG_ETHERNET_SRC instead of PACKET_TAG_L2SHA for now.
If all L2 addresses are to be treated extensively, the structure of mtag
should include the size, but that will not be necessary just yet.


I'm happy with that.

Roy



Re: CVS commit: src/sys/kern

2022-11-15 Thread Michael
Hello,

On Tue, 15 Nov 2022 10:29:56 +
"Michael Lorenz"  wrote:

> Module Name:  src
> Committed By: macallan
> Date: Tue Nov 15 10:29:56 UTC 2022
> 
> Modified Files:
>   src/sys/kern: subr_pserialize.c
> 
> Log Message:
> don't KASSERT(kpreempt_disabled()) while cold
> pserialize_read_*() can be called *very* early in kernel startup these days

... with this clang-built kernels work again on macppc

have fun
Michael


Re: CVS commit: src/sys

2022-11-14 Thread Ryo Shimizu


>> > > This clearly is a layering/abstraction violation and would have been
>> > > good to discuss upfront.
>> > >
>> > > Where do you make use of that information? What about other packet 
>> > > injection
>> > > paths?
>> >
>> > The next commit uses it in if_arp to ensure that the DaD probe sending 
>> > interface
>> > hardware address matches the sending hardware address in the ARP packet as
>> > specified in RFC 5227 section 1.1
>> >
>> > I couldn't think of a better way of achieving this.
>>
>> RFC 5227 says senders must follow the spec but doesn't say receivers'
>> check is must IIUC.
>>
>> I don't think it is a good idea to increase the mbuf size just for
>> broken clients.
>> I think it is better to make the strict check optional (by sysctl or
>> something) and use mtag,
>> so the change doesn't impact on most of us.
>
>Well... another possible option is to unionize l2_* with pattr_*.
>This is possible (IIUC)
>because l2_* are used only on receiving packets while pattr_* are used only on
>sending packets.


Since l2_sha continues to point outside of m_data manipulated by m_adj(),
it can be corrupted by subsequent m_pullup() or other mbuf m_*() operations.
I still believe that using MTAG is appropriate.

How about something like this? (not tested)

git diff -u -u .
diff --git a/sys/net/if_ethersubr.c b/sys/net/if_ethersubr.c
index 34be2d1cf91..18c8c1dcef9 100644
--- a/sys/net/if_ethersubr.c
+++ b/sys/net/if_ethersubr.c
@@ -886,9 +886,17 @@ ether_input(struct ifnet *ifp, struct mbuf *m)
 #endif
}
 
-   /* Store the senders hardware address */
-   m->m_pkthdr.l2_sha = >ether_shost;
-   m->m_pkthdr.l2_shalen = ETHER_ADDR_LEN;
+   //if (arp_do_strict_check_sha)  /* XXX: sysctl? */
+   if (etype == ETHERTYPE_ARP) {
+   /* Store the senders hardware address */
+   struct m_tag *mtag;
+   mtag = m_tag_get(PACKET_TAG_ETHERNET_SRC, ETHER_ADDR_LEN,
+   M_NOWAIT);
+   if (mtag != NULL) {
+   memcpy(mtag + 1, >ether_shost, ETHER_ADDR_LEN);
+   m_tag_prepend(m, mtag);
+   }
+   }
 
/* Strip off the Ethernet header. */
m_adj(m, ehlen);
diff --git a/sys/netinet/if_arp.c b/sys/netinet/if_arp.c
index 90b0be9ff59..3334452c496 100644
--- a/sys/netinet/if_arp.c
+++ b/sys/netinet/if_arp.c
@@ -945,18 +945,23 @@ again:
(in_hosteq(isaddr, myaddr) ||
(in_nullhost(isaddr) && in_hosteq(itaddr, myaddr) &&
 m->m_flags & M_BCAST &&
-ia->ia4_flags & (IN_IFF_TENTATIVE | IN_IFF_DUPLICATED))) &&
-   m->m_pkthdr.l2_shalen == ah->ar_hln && (
-   ah->ar_hln == 0 ||
-   memcmp(m->m_pkthdr.l2_sha, ar_sha(ah), ah->ar_hln) == 0))
+ia->ia4_flags & (IN_IFF_TENTATIVE | IN_IFF_DUPLICATED
{
struct sockaddr_dl sdl, *sdlp;
+   struct m_tag *mtag = NULL;
 
-   sdlp = sockaddr_dl_init(, sizeof(sdl),
-   ifp->if_index, ifp->if_type,
-   NULL, 0, ar_sha(ah), ah->ar_hln);
-   arp_dad_duplicated((struct ifaddr *)ia, sdlp);
-   goto out;
+   //if (arp_do_strict_check_sha)  /* XXX: sysctl? */
+   mtag = m_tag_find(m, PACKET_TAG_ETHERNET_SRC);
+
+   if (mtag == NULL || (ah->ar_hln == ETHER_ADDR_LEN &&
+   memcmp(mtag + 1, ar_sha(ah), ETHER_ADDR_LEN) == 0)) {
+
+   sdlp = sockaddr_dl_init(, sizeof(sdl),
+   ifp->if_index, ifp->if_type,
+   NULL, 0, ar_sha(ah), ah->ar_hln);
+   arp_dad_duplicated((struct ifaddr *)ia, sdlp);
+   goto out;
+   }
}
 
/*
diff --git a/sys/sys/mbuf.h b/sys/sys/mbuf.h
index 6b441b56bd8..bf69372f083 100644
--- a/sys/sys/mbuf.h
+++ b/sys/sys/mbuf.h
@@ -178,8 +178,8 @@ struct m_hdr {
  * checksum) -- this is so we can accumulate the checksum for fragmented
  * packets during reassembly.
  *
- * Size ILP32: 48
- *   LP64: 72
+ * Size ILP32: 40
+ *   LP64: 56
  */
 struct pkthdr {
union {
@@ -203,9 +203,6 @@ struct pkthdr {
int pattr_af;   /* ALTQ: address family */
void*pattr_class;   /* ALTQ: sched class set by classifier 
*/
void*pattr_hdr; /* ALTQ: saved header position in mbuf 
*/
-
-   void*l2_sha;/* l2 sender host address */
-   size_t  l2_shalen;  /* length of the sender address 
*/
 };
 
 /* Checksumming flags (csum_flags). */
@@ -803,6 +800,7 @@ int m_tag_copy_chain(struct mbuf *, struct mbuf *);
*/
 #define PACKET_TAG_MPLS29 /* Indicate it's for MPLS */
 #define PACKET_TAG_SRCROUTE30 /* IPv4 source routing */
+#define PACKET_TAG_ETHERNET_SRC31
 
 /*
  * 

Re: CVS commit: src/sys

2022-11-14 Thread Ryota Ozaki
On Tue, Nov 15, 2022 at 11:29 AM Ryota Ozaki  wrote:
>
> On Mon, Nov 14, 2022 at 8:52 PM Roy Marples  wrote:
> >
> > On 14/11/2022 11:04, Martin Husemann wrote:
> > > This clearly is a layering/abstraction violation and would have been
> > > good to discuss upfront.
> > >
> > > Where do you make use of that information? What about other packet 
> > > injection
> > > paths?
> >
> > The next commit uses it in if_arp to ensure that the DaD probe sending 
> > interface
> > hardware address matches the sending hardware address in the ARP packet as
> > specified in RFC 5227 section 1.1
> >
> > I couldn't think of a better way of achieving this.
>
> RFC 5227 says senders must follow the spec but doesn't say receivers'
> check is must IIUC.
>
> I don't think it is a good idea to increase the mbuf size just for
> broken clients.
> I think it is better to make the strict check optional (by sysctl or
> something) and use mtag,
> so the change doesn't impact on most of us.

Well... another possible option is to unionize l2_* with pattr_*.
This is possible (IIUC)
because l2_* are used only on receiving packets while pattr_* are used only on
sending packets.

Am I missing something?

  ozaki-r


Re: CVS commit: src/sys

2022-11-14 Thread Ryota Ozaki
On Mon, Nov 14, 2022 at 8:52 PM Roy Marples  wrote:
>
> On 14/11/2022 11:04, Martin Husemann wrote:
> > This clearly is a layering/abstraction violation and would have been
> > good to discuss upfront.
> >
> > Where do you make use of that information? What about other packet injection
> > paths?
>
> The next commit uses it in if_arp to ensure that the DaD probe sending 
> interface
> hardware address matches the sending hardware address in the ARP packet as
> specified in RFC 5227 section 1.1
>
> I couldn't think of a better way of achieving this.

RFC 5227 says senders must follow the spec but doesn't say receivers'
check is must IIUC.

I don't think it is a good idea to increase the mbuf size just for
broken clients.
I think it is better to make the strict check optional (by sysctl or
something) and use mtag,
so the change doesn't impact on most of us.

  ozaki-r


Re: CVS commit: src/sys

2022-11-14 Thread Kengo NAKAHARA

Hi,

Thank you for your updating.


Thanks,

On 2022/11/14 19:15, Roy Marples wrote:

On 14/11/2022 09:49, Kengo NAKAHARA wrote:

Hi,

Please update the size in comment, when struct pkthdr is changed.
 https://github.com/NetBSD/src/blob/trunk/sys/sys/mbuf.h#L181


Thanks,


Done, thanks.

Roy


--
//
Internet Initiative Japan Inc.

Device Engineering Section,
Product Division,
Technology Unit

Kengo NAKAHARA 




Re: CVS commit: src/sys

2022-11-14 Thread Roy Marples

On 14/11/2022 11:04, Martin Husemann wrote:

This clearly is a layering/abstraction violation and would have been
good to discuss upfront.

Where do you make use of that information? What about other packet injection
paths?


The next commit uses it in if_arp to ensure that the DaD probe sending interface 
hardware address matches the sending hardware address in the ARP packet as 
specified in RFC 5227 section 1.1


I couldn't think of a better way of achieving this.

Roy


Re: CVS commit: src/sys

2022-11-14 Thread Martin Husemann
On Mon, Nov 14, 2022 at 12:04:20PM +0100, Martin Husemann wrote:
> On Mon, Nov 14, 2022 at 10:15:33AM +, Roy Marples wrote:
> > On 14/11/2022 09:49, Kengo NAKAHARA wrote:
> > > Hi,
> > > 
> > > Please update the size in comment, when struct pkthdr is changed.
> > >      https://github.com/NetBSD/src/blob/trunk/sys/sys/mbuf.h#L181
> > > 
> > > 
> > > Thanks,
> > 
> > Done, thanks.
> 
> This clearly is a layering/abstraction violation and would have been
> good to discuss upfront.
> 
> Where do you make use of that information? What about other packet injection
> paths?

And doesn't this require a kernel version bump?

Martin


Re: CVS commit: src/sys

2022-11-14 Thread Martin Husemann
On Mon, Nov 14, 2022 at 10:15:33AM +, Roy Marples wrote:
> On 14/11/2022 09:49, Kengo NAKAHARA wrote:
> > Hi,
> > 
> > Please update the size in comment, when struct pkthdr is changed.
> >      https://github.com/NetBSD/src/blob/trunk/sys/sys/mbuf.h#L181
> > 
> > 
> > Thanks,
> 
> Done, thanks.

This clearly is a layering/abstraction violation and would have been
good to discuss upfront.

Where do you make use of that information? What about other packet injection
paths?

Martin


Re: CVS commit: src/sys

2022-11-14 Thread Roy Marples

On 14/11/2022 09:49, Kengo NAKAHARA wrote:

Hi,

Please update the size in comment, when struct pkthdr is changed.
     https://github.com/NetBSD/src/blob/trunk/sys/sys/mbuf.h#L181


Thanks,


Done, thanks.

Roy


Re: CVS commit: src/sys

2022-11-14 Thread Kengo NAKAHARA

Hi,

Please update the size in comment, when struct pkthdr is changed.
https://github.com/NetBSD/src/blob/trunk/sys/sys/mbuf.h#L181


Thanks,

On 2022/11/14 18:23, Roy Marples wrote:

Module Name:src
Committed By:   roy
Date:   Mon Nov 14 09:23:42 UTC 2022

Modified Files:
src/sys/net: if_ethersubr.c
src/sys/sys: mbuf.h

Log Message:
net: Store a pointer to the Layer 2 Sender Hardware address in mbuf

The BSD networking stack is designed around passing a mbuf down the chain
and each layer removes the part it's interested in before passing it to
the next. This makes it easy for each layer to do it's work,
but non trivial to work backwards.

As such we now store a pointer to the Senders Hardware address in the
mbuf packet header so that protocols can perform any required validation.


To generate a diff of this commit:
cvs rdiff -u -r1.320 -r1.321 src/sys/net/if_ethersubr.c
cvs rdiff -u -r1.232 -r1.233 src/sys/sys/mbuf.h

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.



--
//
Internet Initiative Japan Inc.

Device Engineering Section,
Product Division,
Technology Unit

Kengo NAKAHARA 




Re: CVS commit: src/sys/dev/tprof

2022-11-10 Thread Ryo Shimizu


>I think this is a bug in your device tree because the KASSERT was 
>intentional:
>
>  - In the ACPI case, we probe for CPU PMU support before calling
>armv8_pmu_init.
>  - In the FDT case, the PMU attaches to a node described in the device
>tree.
>
>So if you hit this KASSERT, AFAICT it means your device tree is describing 
>a device that is not there. Unless I'm missing something here.

I tried to fix it to work properly as a kernel module for debugging and
improving tprof itself, but the current implement is difficult due to
the acpi/fdt pmu interrupt and tprof, so I gave up :-P

I'll revert it. thanks!
-- 
ryo shimizu


Re: CVS commit: src/sys/dev/tprof

2022-11-09 Thread Jared McNeill
I think this is a bug in your device tree because the KASSERT was 
intentional:


 - In the ACPI case, we probe for CPU PMU support before calling
   armv8_pmu_init.
 - In the FDT case, the PMU attaches to a node described in the device
   tree.

So if you hit this KASSERT, AFAICT it means your device tree is describing 
a device that is not there. Unless I'm missing something here.


Take care,
Jared

On Wed, 9 Nov 2022, Ryo Shimizu wrote:


Module Name:src
Committed By:   ryo
Date:   Wed Nov  9 19:06:46 UTC 2022

Modified Files:
src/sys/dev/tprof: tprof_armv8.c

Log Message:
If the hardware does not support PMU, return an error instead of KASSERT.


To generate a diff of this commit:
cvs rdiff -u -r1.14 -r1.15 src/sys/dev/tprof/tprof_armv8.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.




re: CVS commit: src/sys/external/bsd/drm2/pci

2022-10-28 Thread matthew green
> Modified Files:
>   src/sys/external/bsd/drm2/pci: drm_pci.c
>
> Log Message:
> drm(4): Mark PCI interrupt handlers as MP-safe.

i tested this recently and on one of nvidia or radeon
it wasn't working for me.  i can confirm again next
weekend...


.mrg.


Re: CVS commit: src

2022-10-27 Thread David H. Gutteridge

Module Name:src
Committed By:   andvar
Date:   Wed Oct 26 21:56:19 UTC 2022

Modified Files:
   src/sys/dev/spi: spiflash.c
   src/usr.sbin/makefs: README
   src/usr.sbin/makemandb: makemandb.c

Log Message:
fix various typos in comments and makefs README file.

[...]

--- a/usr.sbin/makemandb/makemandb.cWed Oct 26 21:18:49 2022 +
+++ b/usr.sbin/makemandb/makemandb.cWed Oct 26 21:56:19 2022 +

[...]

@@ -503,9 +503,9 @@
}

/* build_file_cache --
- *   This function generates an md5 hash of the file passed as its 2nd 
parameter
+ *   This function generates a md5 hash of the file passed as its 2nd 
parameter


Hi,

This was correct before. It should be "an md5", as we'd write "an M".
(The rule is based on pronunciation. That is, everyone I know would say
"em dee five", not "mhd-five".) (If anything could be changed there, it
should perhaps be "MD5".)

Regards,

Dave


Re: CVS commit: src/lib/libc/time

2022-10-26 Thread Robert Elz
Date:Wed, 26 Oct 2022 10:42:15 -0400
From:Jan Schaumann 
Message-ID:  

  | New proposal:

That looks better (it needs some minor wording changes,
there are too many indefinite articles ('a') which aren't
needed, and there midnight should be just that, no hyphen
or space -- but that's all trivia).

kre



Re: CVS commit: src/lib/libc/time

2022-10-26 Thread Jan Schaumann
Robert Elz  wrote:
> Date:Sun, 23 Oct 2022 13:53:01 -0400
> From:Jan Schaumann 
> Message-ID:  
> 
>   | Hmm, maybe something like this?
> 
> I think there is still too much there, you don't have
> to explain everything (or almost anything), but it is
> in the right direction I think.

New proposal:

---

[...] components are not restricted to their normal
ranges and will be normalized, if need be.

For example, consider a struct tm initialized with a
tm_year = 122, tm_mon = 10, tm_mday = 30, tm_hour =
22, tm_min = 57, and a tm_sec = 0.  Incrementing
tm_min by 13 and calling mktime() would lead to a
tm_hour = 23 and tm_min = 10.

This normalizing can lead to cascading changes: Again
using a struct tm initialized as in the above example
but with a tm_hour = 23, the same change would lead to
a tm_mon = 11, tm_mday = 1, tm_hour = 0, and tm_min =
10.

Negative values may also be normalized with similar
cascading effect such that e.g., a tm_hour of −1 means
1 hour before mid‐ night on the previous day and so
on.

---

?

-Jan


re: CVS commit: src/usr.bin/ldd

2022-10-25 Thread Ryo ONODERA
Hi,

matthew green  writes:

>> With this change, ldd /lib/libc.so.12.220 fails under NetBSD/amd64 9.99.101.
>>
>>
>> /lib/libc.so.12.220:
>> ldd: /lib/libc.so.12.220: invalid ELF class 2, expected 1
>>
>> It seems that elf32_ldd() fails.
>>
>> Builds of some pkgsrc packages that use gobject introspection and meson fails
>> because they uses ldd command during build.
>
> this should be fixed now.  sorry for the failure..

I am sorry. I have missed your email.
Thanks for your quick fix!!!

> .mrg.

-- 
Ryo ONODERA // r...@tetera.org
PGP fingerprint = 82A2 DC91 76E0 A10A 8ABB  FD1B F404 27FA C7D1 15F3


Re: CVS commit: src/sys/net

2022-10-25 Thread Masanobu SAITOH


On 2022/10/25 14:55, Masanobu SAITOH wrote:
> 
> 
> On 2022/10/25 14:51, matthew green wrote:
>> "SAITOH Masanobu" writes:
>>> Module Name:src
>>> Committed By:   msaitoh
>>> Date:   Mon Oct 24 07:45:44 UTC 2022
>>>
>>> Modified Files:
>>> src/sys/net: if_dl.h
>>>
>>> Log Message:
>>> Increase sdl_data so that more then IFNAMSIZ bytes are available.
>>>
>>>  - Increase the size of dl_data[] from 12 to 24.
>>>  - Same as OpenBSD.
>>
>> isn't this a binary compat issue?  eg, 'struct sockaddr_dl' changes,
>> and that, and things based upon it, are in user interfaces.  i had
>> a look and i believe it's a problem, but maybe i missed something.
>>
>> thanks.
>>
>>
>> .mrg.
> 
> struct dl_addr is at the end of struct sockaddr_dl.
> dl_data is at the end of struct dl_addr.
> So I think it's has no problem for old binaries.

I've roughly tested with old arp, ndp, ifconfig, ifmcstat and all /usr/tests
on a new kernel and I didn't saw any problem.

I think getifaddrs(3) has no problem. The routing message has no problem
because struct rtm_msglen has rtm_msglen and we can get the next message
using with it. I can't see any kernel data structure which has
struct sockaddr_dl foobadr[xxx] array.

One problem might be exist. An old userland program allocate buffer
with sizeof(struct sockaddr_dl), the size is smaller than the newer.
The old program might copy with memcpy(old_sized_buf, newdata, dla->sdl_len).

Should I backout the change until the COMPAT code is written?

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: CVS commit: src/sys/net

2022-10-24 Thread Masanobu SAITOH



On 2022/10/25 14:51, matthew green wrote:
> "SAITOH Masanobu" writes:
>> Module Name: src
>> Committed By:msaitoh
>> Date:Mon Oct 24 07:45:44 UTC 2022
>>
>> Modified Files:
>>  src/sys/net: if_dl.h
>>
>> Log Message:
>> Increase sdl_data so that more then IFNAMSIZ bytes are available.
>>
>>  - Increase the size of dl_data[] from 12 to 24.
>>  - Same as OpenBSD.
> 
> isn't this a binary compat issue?  eg, 'struct sockaddr_dl' changes,
> and that, and things based upon it, are in user interfaces.  i had
> a look and i believe it's a problem, but maybe i missed something.
> 
> thanks.
> 
> 
> .mrg.

struct dl_addr is at the end of struct sockaddr_dl.
dl_data is at the end of struct dl_addr.
So I think it's has no problem for old binaries.

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


re: CVS commit: src/sys/net

2022-10-24 Thread matthew green
"SAITOH Masanobu" writes:
> Module Name:  src
> Committed By: msaitoh
> Date: Mon Oct 24 07:45:44 UTC 2022
>
> Modified Files:
>   src/sys/net: if_dl.h
>
> Log Message:
> Increase sdl_data so that more then IFNAMSIZ bytes are available.
>
>  - Increase the size of dl_data[] from 12 to 24.
>  - Same as OpenBSD.

isn't this a binary compat issue?  eg, 'struct sockaddr_dl' changes,
and that, and things based upon it, are in user interfaces.  i had
a look and i believe it's a problem, but maybe i missed something.

thanks.


.mrg.


Re: CVS commit: src/lib/libc/time

2022-10-24 Thread Steffen Nurpmeso
Taylor R Campbell wrote in
 <20221023170035.2542f60...@jupiter.mumble.net>:
 ...
 |If you use a monotonic timer to sample the POSIX clock before and
 |after a leap second, the POSIX clock will appear to have taken twice
 |as long as it should to pass the leap second.

Just to note that the next leap second could be a negative one.

--steffen
|
|Der Kragenbaer,The moon bear,
|der holt sich munter   he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)


Re: CVS commit: src/lib/libc/time

2022-10-24 Thread Robert Elz
Date:Sun, 23 Oct 2022 13:53:01 -0400
From:Jan Schaumann 
Message-ID:  

  | Hmm, maybe something like this?

I think there is still too much there, you don't have
to explain everything (or almost anything), but it is
in the right direction I think.

  | For example, consider a struct tm initialized with a
  | tm_year = 122, tm_mon = 11, tm_mday = 30, tm_hour =
  | 22, tm_min = 57, and a tm_sec = 0, using UTC,
  | representing 2022‐12‐31T22:57:00Z.

That last bit (the ISO format spec of the time) isn't
really needed, but doesn't hurt either - but I would avoid
an example that touches anywhere near midnight Dec 31, UTC,
(which this one does later, if not here) as that's exactly
where leap seconds start to appear, and (as illustrated in
the side-discussion in this thread) that's where things get messy.
Best to just avoid that  (Avoid June 30 for the same reason.)

  | Incrementing
  | tm_min by 13 and calling mktime() would yield a time_t
  | with value 1672528200,

That's irrelevant, the time_t returned isn't what's really
interesting here, and its binary value is way too much (useless)
information, it is the modification made to the tm that matters.
It is already clear enough in the doc (I think) that the result
from mktime() is the time_t for the normalised tm.

  | representing 2022‐12‐31T23:10:00Z,

this, or just give the adjusted tm_min and tm_hour
values, there's no need for both, and

  | as the tm_min = 70 was
  | normalized to an increment of tm_hour by one and
  | tm_min = 10.

no need for the explanation of how it was done.

  | This normalizing can lead to cascading changes: Again
  | using a struct tm initialized as in the above example
  | but with a tm_hour = 23, the same change would yield a
  | time_t with value 1672531800, representing
  | 2023‐01‐01T00:10:00Z

That's the adjustment we want to avoid, as it gets right into
what happens if we're observing leap seconds, and one was to
happen in that period.There's no need to show the year being
incremented, showing the month going up would be enough, readers
ought to be able to deduce that if the month changes from Dec to
Jan then the year would be incremented by one.

  | the normalization of tm_min incremented tm_hour,

This explanation is not needed, but if it were, that
would be correct, but

  | which lead to a normalization of tm_mday, which in
  | turn caused tm_mon to be normalized,

but not those.   Those fields (in this example) were
already within the appropriate range, they don't need
to be normalised, they're simply adjusted, or as in
this (or tm_hour above):

  | which in turn lead to the increment of tm_year.

  | In addition,

That's perhaps poor wording here, "addition" followed
immediately by "negative", it could be "Also" which
avoids this, but this lead in clause is not really
needed at all, just begin like:

  | negative values may also be normalized,
  | such that e.g., a tm_hour of −1 means 1 hour before
  | midnight, tm_mday of 0 means the day preceding the
  | current month, and tm_mon of −2 means 2 months before
  | January of tm_year.

Again, too much there, we don't need examples of everything.

I still feel though that an example with more than one adjustment
to a tm returned by localtime (though how it was originally
created is irrelevant) and which would affect Feb 29 were
it a leap year, would be worth giving (in both the leap
year and non leap year cases) so we get Feb 29 in one
case and Mar 1 in the other, from an adjustment that
affects months (and days, either directly, or as a flow on
from hours, mins, or secs) - like going back a month from
Mar 28, then forward to the next day, and sometimes still
being in Mar.

  | The fact that mktime(2) returns a time_t makes the
  | phrasing even more awkward,

Yes, as above.

  | and I suppose we could
  | leave out the actual value and say "would yield a
  | time_t representing..."?

No, nothing about the result returned at all, this should
all just be about what normalising the tm causes to happen.
Those values are not just internal, the tm passed (via ref)
is modified if needed before mktime returns.

kre



Re: CVS commit: src/lib/libc/time

2022-10-23 Thread Warner Losh
On Sun, Oct 23, 2022 at 11:00 AM Taylor R Campbell <
campbell+netbsd-source-change...@mumble.net> wrote:

> > Date: Sun, 23 Oct 2022 07:39:25 -0600
> > From: Warner Losh 
> >
> > I guess a more accurate way of saying this is that leap seconds simply
> > aren't reliable, cannot be made reliable, and this affects normalization
> > in ways too numerous to mention due to the details of the tz files, bugs
> > in the system, and lack of others to implement them correctly.
>
> I think you mean `POSIX clocks simply aren't reliable'.  They _could_
> be made reliable, though, by fixing the the calendar arithmetic
> formula in POSIX for mapping time_t to and from UTC -- just like POSIX
> already fixed the bug in its Gregorian leap year formula.
>

Except they can't, at least not practically enough to be a standard. The
Gregorian Leap Year formula is a mathematical formula that needs no
further data other than the broken down time to compute. It's not an
observational calendar, but a computational or arithmetic one. UTC is an
observational calendar. We barely know if there's going to be a leap
second in the coming months, and have nothing more than a vague notion
of when the one after that might be. You must have a table of all past
leap seconds to do any kind of sensible mapping. And you also must
have some way to keep that up to date, even when machines are
powered off, or installed from not really that old media (anything older
than 6 months can't possibly have the right leap table, except by
chance). And then the question becomes how do you get it, do you
assume connectivity, some standard media format, some standard file
format, etc. All of these details means POSIX can't really fix this. And
even if they do, the current formula has been around so long there's a
lot of dusty decks of code that will likely silently break. You can
ameliorate
that somewhat by inventing new interfaces, but issues like the one you go
into below will still persist.


> > > The code works with either set of tzdata files, POSIX stretchy secs,
> > > or UTC with leap secs - claiming that one doesn't happen, or cannot
> > > happen, isn't really correct.
> >
> > Yea, and even 'posix stretchy sec' is really a misnomer. POSIX simply
> > counts time without leap seconds. Each second is the same length,
>
> Not at all.
>
> If you use a monotonic timer to sample the POSIX clock before and
> after a leap second, the POSIX clock will appear to have taken twice
> as long as it should to pass the leap second.
>
> Of course, it's worse.  If sampled at _subsecond_ intervals, a POSIX
> clock behaves _erratically_: it spontaneously rewinds itself!
>
> Suppose we have a machine with a monotonic clock that counts SI
> seconds as well as a POSIX clock:
>
> SI monotonicPOSIX
> 123.25  1483228799.00
> 123.50  1483228799.25
> 123.75  1483228799.50  # t0 = boottime + 123.75
> 124.00  1483228799.75
> 124.25  1483228800.00  # leap second begins at 2016-12-31T23:59:60Z
> 124.50  1483228800.25
> 124.75  1483228800.50
> 125.00  1483228800.75  # t1 = boottime + 125.00
> 125.25  1483228800.00  # POSIX clock rewinds at
> 2017-01-01T00:00:00Z!
> 125.50  1483228800.25
> 125.75  1483228800.50  # t2 = boottime + 125.75
> 126.00  1483228800.75
>
> At supersecond resolution, t2 - t0 is a duration of 2 SI seconds, but
> a POSIX clock reports a time difference POSIX(t2) - POSIX(t0) of 1, so
> `POSIX seconds' are not always SI seconds -- it is not the case that
> `each [POSIX] second is the same length', even ignoring physical clock
> sampling error.
>

Right. Except during that brief interval around a leap second, all the
seconds
are the same size. They aren't expanded or contracted by running the clock
at a different frequency that was done between approx 1960-1972 which is
often referred to as the rubber leap second era. It was more on that basis
that
I was objecting to the turn of phrase because that sort of thing isn't
happening.


> And, of course, at subsecond resolution, the POSIX clock rewinds.  The
> monotonic clock correctly has t1 < t2, but POSIX(t1) > POSIX(t2).  And
> this erratic behaviour is much worse than a typical NTP-driven clock
> adjustment at random times, because by design this erratic behaviour
> happens on ~every computer on the planet simultaneously!
>

Yea, if NTP knows about the leap, it can deal with it. The problem as you
say comes in when the stratum 1 servers don't announce the leap second
far enough in advance for the implementations to cope. It then devolves to
the 'when, exactly, do you step the time back' problem since there's a
couple
of choices, unless you have the 'leap smear' ntp servers which do it over
a few hours.


> There's no need for this nonsense except insistence on the formula
> that says every UTC day is counted by 86400 `POSIX seconds'.  POSIX
> could be revised to fix this bug in the clock by just not doing civil
> calendar 

Re: CVS commit: src/lib/libc/time

2022-10-23 Thread Taylor R Campbell
> Date: Sun, 23 Oct 2022 12:19:37 -0600
> From: Warner Losh 
> 
>  You must have a table of all past
> leap seconds to do any kind of sensible mapping. And you also must
> have some way to keep that up to date, [...]

It's the same for all civil calendar matters with political changes
around summer time, but at a smaller scale: with an outdated tzdb,
where localtime might be off for an hour, mktime might be off by a few
seconds.

> > And, of course, at subsecond resolution, the POSIX clock rewinds.  The
> > monotonic clock correctly has t1 < t2, but POSIX(t1) > POSIX(t2).  And
> > this erratic behaviour is much worse than a typical NTP-driven clock
> > adjustment at random times, because by design this erratic behaviour
> > happens on ~every computer on the planet simultaneously!
> 
> Yea, if NTP knows about the leap, it can deal with it. The problem as you
> say comes in when the stratum 1 servers don't announce the leap second
> far enough in advance for the implementations to cope. It then devolves to
> the 'when, exactly, do you step the time back' problem since there's a
> couple
> of choices, unless you have the 'leap smear' ntp servers which do it over
> a few hours.

I mentioned NTP-driven clock adjustments as an example of _other_
times that a POSIX clock might rewind.  Such adjustments arise from
local clock drift and so occur randomly, in contrast to the rewinding
on leap seconds that is globally coordinated as a deliberate design
decision of POSIX leading to simultaneous worldwide computer system
failures.

NTP doesn't have to know about leap seconds at all, nor does POSIX.
They could both just report the number of SI seconds (averaged in the
usual TAI way) since an appropriate epoch, and all the leap second
adjustment logic could be ripped out and moved to tzdb and mktime just
like time zones and summer time.

The fundamental problem with a POSIX clock -- and to a lesser extent
NTP, although NTP at least transmits additional information to work
around it during a leap second transition unlike POSIX programs -- is
that it goes out of its way to step back the _internal concept of
timecounting_ (tick-tick-tick, counting what everyone expects to be SI
seconds) just to deal with an _external civil calendar presentation_,
like the summer time change.  And it makes sure to do so
simultaneously on all computers worldwide.  That's a spectacularly bad
design leading to tremendously costly mistakes.  Leap seconds aren't a
big deal; the erratic design of POSIX clocks is.


Re: CVS commit: src/lib/libc/time

2022-10-23 Thread Jan Schaumann
Robert Elz  wrote:
> Date:Sat, 22 Oct 2022 13:42:54 -0400
> From:Jan Schaumann 
> Message-ID:  
> 
>   | A full set of examples strikes me as too much here
> 
> A full set would be yes, but I think we could handle 3.

Hmm, maybe something like this?


...and will be normalized, if need be.

For example, consider a struct tm initialized with a
tm_year = 122, tm_mon = 11, tm_mday = 30, tm_hour =
22, tm_min = 57, and a tm_sec = 0, using UTC,
representing 2022‐12‐31T22:57:00Z.  Incrementing
tm_min by 13 and calling mktime() would yield a time_t
with value 1672528200, representing
2022‐12‐31T23:10:00Z, as the tm_min = 70 was
normalized to an increment of tm_hour by one and
tm_min = 10.

This normalizing can lead to cascading changes: Again
using a struct tm initialized as in the above example
but with a tm_hour = 23, the same change would yield a
time_t with value 1672531800, representing
2023‐01‐01T00:10:00Z: the normalization of tm_min
incremented tm_hour, which lead to a normalization
of tm_mday, which in turn caused tm_mon to be
normalized, which in turn lead to the increment of
tm_year.

In addition, negative values may also be normalized,
such that e.g., a tm_hour of −1 means 1 hour before
midnight, tm_mday of 0 means the day preceding the
current month, and tm_mon of −2 means 2 months before
January of tm_year.
---

The fact that mktime(2) returns a time_t makes the
phrasing even more awkward, and I suppose we could
leave out the actual value and say "would yield a
time_t representing..."?

Other ideas to improve this?

-Jan


Re: CVS commit: src/lib/libc/time

2022-10-23 Thread Taylor R Campbell
> Date: Sun, 23 Oct 2022 07:39:25 -0600
> From: Warner Losh 
> 
> I guess a more accurate way of saying this is that leap seconds simply
> aren't reliable, cannot be made reliable, and this affects normalization
> in ways too numerous to mention due to the details of the tz files, bugs
> in the system, and lack of others to implement them correctly.

I think you mean `POSIX clocks simply aren't reliable'.  They _could_
be made reliable, though, by fixing the the calendar arithmetic
formula in POSIX for mapping time_t to and from UTC -- just like POSIX
already fixed the bug in its Gregorian leap year formula.

> > The code works with either set of tzdata files, POSIX stretchy secs,
> > or UTC with leap secs - claiming that one doesn't happen, or cannot
> > happen, isn't really correct.
> 
> Yea, and even 'posix stretchy sec' is really a misnomer. POSIX simply
> counts time without leap seconds. Each second is the same length,

Not at all.

If you use a monotonic timer to sample the POSIX clock before and
after a leap second, the POSIX clock will appear to have taken twice
as long as it should to pass the leap second.

Of course, it's worse.  If sampled at _subsecond_ intervals, a POSIX
clock behaves _erratically_: it spontaneously rewinds itself!

Suppose we have a machine with a monotonic clock that counts SI
seconds as well as a POSIX clock:

SI monotonicPOSIX
123.25  1483228799.00
123.50  1483228799.25
123.75  1483228799.50  # t0 = boottime + 123.75
124.00  1483228799.75
124.25  1483228800.00  # leap second begins at 2016-12-31T23:59:60Z
124.50  1483228800.25
124.75  1483228800.50
125.00  1483228800.75  # t1 = boottime + 125.00
125.25  1483228800.00  # POSIX clock rewinds at 2017-01-01T00:00:00Z!
125.50  1483228800.25
125.75  1483228800.50  # t2 = boottime + 125.75
126.00  1483228800.75

At supersecond resolution, t2 - t0 is a duration of 2 SI seconds, but
a POSIX clock reports a time difference POSIX(t2) - POSIX(t0) of 1, so
`POSIX seconds' are not always SI seconds -- it is not the case that
`each [POSIX] second is the same length', even ignoring physical clock
sampling error.

And, of course, at subsecond resolution, the POSIX clock rewinds.  The
monotonic clock correctly has t1 < t2, but POSIX(t1) > POSIX(t2).  And
this erratic behaviour is much worse than a typical NTP-driven clock
adjustment at random times, because by design this erratic behaviour
happens on ~every computer on the planet simultaneously!

There's no need for this nonsense except insistence on the formula
that says every UTC day is counted by 86400 `POSIX seconds'.  POSIX
could be revised to fix this bug in the clock by just not doing civil
calendar adjustments in the basic clock that goes tick-tick-tick for
counting what most people think are going to be SI seconds.  For the
time_t<->UTC conversion in libc, machines with out-of-date tzdb would
just be off by a few seconds sometimes, no worse than being off by an
hour in the time_t<->localtime conversion with an out-of-date tzdb
across an updated summer time change.


Re: CVS commit: src/lib/libc/time

2022-10-23 Thread Warner Losh
On Sat, Oct 22, 2022 at 10:55 PM Robert Elz  wrote:

> Date:Sat, 22 Oct 2022 20:17:57 -0600
> From:Warner Losh 
> Message-ID:  <
> canczdfqhkz0xs7lf6lmjveontc5dgsonons_ug6fzsf30og...@mail.gmail.com>
>
>
>   | On the other hand, adding a caveat that leap seconds don't exist in a
> POSIX
>   | time_t and that's necessarily reflected in the normalization wouldn't
> be
>   | bad.
>
> The problem with doing that is that while we don't install it this
> way, it is possible to install tzdata such that leap seconds do occur
> (despite that not being posix) either unconditionally, or installing
> both and leaving it up to the user by their choice of timezone to use
> (which makes rather a mess).
>

That's true. In the default "UTC system time" clock, leap seconds simply
do no work at all. With the tzdata with leap seconds, they work for a
limited
subset of things, but not quite everything because you can't completely
close the gap due to time_t not having an encoding for them. Having tried
to deploy systems that needed to present a pedantically-correct UTC
time to the outside world shows many hole (most of which self correct
and are mostly right not near leap seconds).

I guess a more accurate way of saying this is that leap seconds simply
aren't reliable, cannot be made reliable, and this affects normalization
in ways too numerous to mention due to the details of the tz files, bugs
in the system, and lack of others to implement them correctly.


> The code works with either set of tzdata files, POSIX stretchy secs,
> or UTC with leap secs - claiming that one doesn't happen, or cannot
> happen, isn't really correct.
>

Yea, and even 'posix stretchy sec' is really a misnomer. POSIX simply
counts time without leap seconds. Each second is the same length,
and it doesn't make them longer or shorter. It just assumes some
ad-hoc, beyond the standard method for synchronization. The quality
of these ad-hoc methods ranges from terrible to halfway-decent, but
none are perfect.

OK. I've got my annual leap seconds rant out of the way. Even 15 years
after shipping these systems, the almost (but not completely) right nature
of how computers implement them still has my knee jerking just a little
too much...

Warner


Re: CVS commit: src/lib/libc/time

2022-10-23 Thread Warner Losh
On Sat, Oct 22, 2022, 8:05 PM Robert Elz  wrote:

> Date:Sat, 22 Oct 2022 13:42:54 -0400
> From:Jan Schaumann 
> Message-ID:  
>
>   | A full set of examples strikes me as too much here
>
> A full set would be yes, but I think we could handle 3.
> They don't need to be C code examples, just something
> like
> if tm_year==2022 tm_mon==10 tm_mday==19
>tm_hour==23 tm_min==58 tm_sec==17 and tm_dst==-1
>
> and tm_min was incremented as tm_min+=180, and then
> mktime() called upon the result, the struct tm would
> now have tm_min==1 tm_hour==24 and tm_mday==20, with
> tm_isdst 0 or 1 depending upon the local time zone.
> The other fields shown would not be altered in this
> example, the fields of struct tm not show here would
> (tm_wday, ...) would be filled in with appropriate values.
>

That sounds like a good idea...

[Using tm_min rather than tm_sec to avoid needing to include
> caveats about possible leap seconds, which, at least currently,
> could never happen in November, but all of that is too much
> to bother with.]
>

On the other hand, adding a caveat that leap seconds don't exist in a POSIX
time_t and that's necessarily reflected in the normalization wouldn't be
bad. It's a quite common misconception that somehow they do because they
exist in UTC... while you can encode any valid UTC time, they are always
normalized away.

The other examples could be even shorter, no need to repeat
> the final sentence about the other field changes or otherwise.
>
>   | Perhaps:
>   |
>   | This normalization is done in order from tm_sec up to
>   | tm_year,
>
> No, that is what you cannot say, because it is wrong.
>
> We do not always start from tm_sec, and tm_mon is *always*
> normalised (with its possible flow on to tm_year) before
> tm_mday - nothing else makes sense.   tm_year isn't
> "normalized" at all, though it might be adjusted.
>
> But all of this is what we already agreed is too much
> to explain.   And no matter how simple it makes things,
> the man pages cannot (deliberately) lie.
>

Agreed. These nuances are important.

  | possibly leading to cascading values.  That
>   | is, a tm_sec value of e.g., 185 with a tm_min value of
>   | 58 could lead to an increment of tm_min by three, and
>   | thus further incrementing tm_hour by one, and so on.
>
> This is more explaining how it works, which if we do at all,
> we need to do correctly.   That's where a few examples can
> be better, they allow the reader to deduce what probably
> happens, the weird one is there to show that the results
> will not always be what is expected at first glance, but
> that they always are correct.
>
>   | As with most things relating to
>   | time, dates, and calendars, the full details of these
>   | side effects are often non‐obvious, and it may be best
>   | to avoid such scenarios.
>
> And that part is just useless, unless you mean that non-normalised
> values should never be used, perhaps just in some fields (and it
> isn't that anyway, a big enough increment of tm_sec is the same
> as an increment of tm_mday or tm_mon) in which case we should just
> say that.   If not that, then hinting that it is sometimes odd,
> and those cases, which ones we refuse to tell you, should be
> avoided is a complete waste of time.   Which scenarios should be
> avoided?   The answer really is none of them, the answer is
> always correct, it just might not be what the user expected.
>


All good points.

Warner

> kre
>
>


Re: CVS commit: src/lib/libc/time

2022-10-22 Thread Robert Elz
Date:Sat, 22 Oct 2022 20:17:57 -0600
From:Warner Losh 
Message-ID:  



  | On the other hand, adding a caveat that leap seconds don't exist in a POSIX
  | time_t and that's necessarily reflected in the normalization wouldn't be
  | bad.

The problem with doing that is that while we don't install it this
way, it is possible to install tzdata such that leap seconds do occur
(despite that not being posix) either unconditionally, or installing
both and leaving it up to the user by their choice of timezone to use
(which makes rather a mess).

The code works with either set of tzdata files, POSIX stretchy secs,
or UTC with leap secs - claiming that one doesn't happen, or cannot
happen, isn't really correct.

kre




Re: CVS commit: src/lib/libc/time

2022-10-22 Thread Robert Elz
Date:Sun, 23 Oct 2022 08:33:18 +0700
From:Robert Elz 
Message-ID:  <7638.1666488...@jacaranda.noi.kre.to>

  | and tm_min was incremented as tm_min+=180, and then
  | mktime() called upon the result, the struct tm would
  | now have tm_min==1 tm_hour==24 and tm_mday==20, with

That is nonsense of course, but I think you get the idea
(and being nonsense shows why the examples should be
tested )

kre


Re: CVS commit: src/lib/libc/time

2022-10-22 Thread Robert Elz
Date:Sat, 22 Oct 2022 13:42:54 -0400
From:Jan Schaumann 
Message-ID:  

  | A full set of examples strikes me as too much here

A full set would be yes, but I think we could handle 3.
They don't need to be C code examples, just something
like
if tm_year==2022 tm_mon==10 tm_mday==19
   tm_hour==23 tm_min==58 tm_sec==17 and tm_dst==-1

and tm_min was incremented as tm_min+=180, and then
mktime() called upon the result, the struct tm would
now have tm_min==1 tm_hour==24 and tm_mday==20, with
tm_isdst 0 or 1 depending upon the local time zone.
The other fields shown would not be altered in this
example, the fields of struct tm not show here would
(tm_wday, ...) would be filled in with appropriate values.

[Using tm_min rather than tm_sec to avoid needing to include
caveats about possible leap seconds, which, at least currently,
could never happen in November, but all of that is too much
to bother with.]

The other examples could be even shorter, no need to repeat
the final sentence about the other field changes or otherwise.

  | Perhaps:
  |
  | This normalization is done in order from tm_sec up to
  | tm_year,

No, that is what you cannot say, because it is wrong.

We do not always start from tm_sec, and tm_mon is *always*
normalised (with its possible flow on to tm_year) before
tm_mday - nothing else makes sense.   tm_year isn't
"normalized" at all, though it might be adjusted.

But all of this is what we already agreed is too much
to explain.   And no matter how simple it makes things,
the man pages cannot (deliberately) lie.

  | possibly leading to cascading values.  That
  | is, a tm_sec value of e.g., 185 with a tm_min value of
  | 58 could lead to an increment of tm_min by three, and
  | thus further incrementing tm_hour by one, and so on.

This is more explaining how it works, which if we do at all,
we need to do correctly.   That's where a few examples can
be better, they allow the reader to deduce what probably
happens, the weird one is there to show that the results
will not always be what is expected at first glance, but
that they always are correct.

  | As with most things relating to
  | time, dates, and calendars, the full details of these
  | side effects are often non‐obvious, and it may be best
  | to avoid such scenarios.

And that part is just useless, unless you mean that non-normalised
values should never be used, perhaps just in some fields (and it
isn't that anyway, a big enough increment of tm_sec is the same
as an increment of tm_mday or tm_mon) in which case we should just
say that.   If not that, then hinting that it is sometimes odd,
and those cases, which ones we refuse to tell you, should be
avoided is a complete waste of time.   Which scenarios should be
avoided?   The answer really is none of them, the answer is
always correct, it just might not be what the user expected.

kre



Re: CVS commit: src/lib/libc/time

2022-10-22 Thread Jan Schaumann
Robert Elz  wrote:
> jo...@bec.de said:
>   | I'd actually weasle out a bit
> 
> Yes, I would too, but

A full set of examples strikes me as too much here
still, and I do like the idea of weaseling out.

Perhaps:

This normalization is done in order from tm_sec up to
tm_year, possibly leading to cascading values.  That
is, a tm_sec value of e.g., 185 with a tm_min value of
58 could lead to an increment of tm_min by three, and
thus further incrementing tm_hour by one, and so on.
Likewise, negative values can lead to decrementing
other tm(3) fields.  As with most things relating to
time, dates, and calendars, the full details of these
side effects are often non‐obvious, and it may be best
to avoid such scenarios.

?

-Jan


Re: CVS commit: src/lib/libc/time

2022-10-22 Thread Robert Elz
Date:Fri, 21 Oct 2022 15:00:41 -0400
From:Jan Schaumann 
Message-ID:  

  | I believe that it's useful for people to know that
  | values outside the normal ranges are normalized,

Agreed.

  | as well as what that might look like.

The problem with trying to specify that, as we have
discovered, is that it is not easy.   The problem with
only doing half of the job here, is that sometime later
someone will read it, see what it says happens, and
what it doesn't say, and then expect the code to do
that - when it probably doesn't.

I'll come back to the rest of your message a bit lower,
but first a brief interlude ...

jo...@bec.de said:
  | I'd actually weasle out a bit

Yes, I would too, but

jo...@bec.de continued:
  | and just declare that normalisation is best effort and we
  | don't guarantee behavior for values that are very much outside
  | the range of the corresponding field. E.g. anything where
  | numerical overflow can happen in a component.

that isn't really the problem.  tzcode is very careful with arithmetic
to avoid those kinds of problems - if there's anywhere that it isn't
(which is still possible) that would be fixed if pointed out.

The problem is just the complexity of specifying what does happen,
in a way that fits the manual page in a way that is easier to
follow that simply reading the source (which isn't all that easy).

Now we return to the originally scheduelled e-mail:

jscha...@netmeister.org said:
  | That helps a lot when trying to determine _why_ the
  | output of mktime(3) [...]

  | Noting that values may cascade is useful [...]

I wonder if rather than attempting to specify what happens,
it might be better in this case to give examples of using
struct tm alterations, followed by mktime(), to alter the time,
and what the results are.

I'm not generally much in favour of examples in man pages,
I prefer specification, and leave it up to the reader to
determine how to use things to achieve the desired result,
as while giving an example or two (or three) of what can be
done is great for people who want to do one of those exact
things (and so can just copy the example) but is useless for
someone whose needs are different than any of the examples,
and is left trying to guess what will happen.

But here a two or three good examples of what happens might
just be a sane solution, perhaps two fairly simple ones, but
which demonstrate what happens with addition, and subtraction
- using just one field modification, but having that result in
a cascade through a few of the other fields - perhaps the 180
seconds forward, starting at 23:58 on the last day of some
month, and 32 hours backwards starting at 01:15 on the second
of a month.Those should both produce entirely reasonable
and understandable results.

The third should show multiple modifications, in a way that produces,
or can do, a surprising result, perhaps 14 days forward, also
2 months forward, and some irrelevant small number of minutes backward
(that just to show that multiple fields can be altered, and they don't
need to all go in the same direction) starting at 13:27 on Dec 15 - where
the result can be Feb 29, or Mar 1 (at something a bit earlier than 13:27,
perhaps even make the number of minutes subtracted > 27 in that example)
depending upon whether the year is a leap year or not.   (Or make it
6 months forward (plus...) starting from August).  This one should be
one full example, with something like "if the starting year had been
 then the result would have been" added to show the difference
between leap years and others.

If this is done, the examples should all be tested to verify the
claimed results, but the man page should not need a lot of explanation
of why the result shown is achieved, just the input, the modification
made, and the result.

kre



Re: CVS commit: src/lib/libc/time

2022-10-21 Thread Joerg Sonnenberger
Am Fri, Oct 21, 2022 at 02:06:32PM +0700 schrieb Robert Elz:
> Date:Fri, 21 Oct 2022 03:05:15 +
> From:"Jan Schaumann" 
> Message-ID:  <20221021030515.cdc9df...@cvs.netbsd.org>
> 
>   | Note normalizing behavior of mktime(3) using language from FreeBSD.
> 
> If we are going to start specifying what happens (how the struct tm
> is normalised) then we really also need to specify in which order
> the fields are corrected.   It makes a difference.

I'd actually weasle out a bit and just declare that normalisation is
best effort and we don't guarantee behavior for values that are very
much outside the range of the corresponding field. E.g. anything where
numerical overflow can happen in a component.

Joerg


Re: CVS commit: src/lib/libc/time

2022-10-21 Thread Jan Schaumann
Robert Elz  wrote:
> Date:Fri, 21 Oct 2022 11:54:08 -0400
> From:Jan Schaumann 
> Message-ID:  
> 
>   | Right, but all that strikes me as too much to put into
>   | the manual page here.
> 
> I agree, which is why I wondered if we should be giving any
> details about how it is done at all.

I believe that it's useful for people to know that
values outside the normal ranges are normalized, as
well as what that might look like.

That helps a lot when trying to determine _why_ the
output of mktime(3) with a tm_sec = 180 yields no
error and a value of tm_min += 3.  I don't think many
people would want to go and dig up the sources to make
sense of this.

Noting that values may cascade is useful (because
that, too, is not obvious), but seems to me the limit
of detail desired here, while still being useful.

-Jan


Re: CVS commit: src/lib/libc/time

2022-10-21 Thread Robert Elz
Date:Fri, 21 Oct 2022 11:54:08 -0400
From:Jan Schaumann 
Message-ID:  

  | Right, but all that strikes me as too much to put into
  | the manual page here.

I agree, which is why I wondered if we should be giving any
details about how it is done at all.

  | I think what matters in this context is that there is
  | an order and that there is a cascading effect.

Perhaps, but

  | "This normalization is done in order from tm_sec to
  | tm_mon (inclusive), possibly leading to cascading
  | values."

As a generic order, that's backwards.   Really we need to
go from year (which is always OK) to month (which can always
be corrected simply, assuming the Gregorian calendar is in
effect, which we assume it is, even for years before that
calendar was created).   Once we know the year and month,
but not before, the day of month can be corrected (which
might then affect the month and year, but only once).

The code doesn't do it in quite that order, it does the
secs mins hours first, but doing it that way ensures that
the hours correction cannot affect the day, after it has
been corrected, the day(of month) always gets set last
(leap seconds excepted).

It really is moderately complicated - trying to explain it
at all risks being simply wrong, or so vague as to be
completely useless.I'm not sure that what was there
before (no details at all, just (in other words) "it gets
fixed") wasn't as good as is reasonable to expect.

kre



Re: CVS commit: src/lib/libc/time

2022-10-21 Thread Jan Schaumann
Robert Elz  wrote:
> Date:Fri, 21 Oct 2022 10:36:23 -0400
> From:Jan Schaumann 
> Message-ID:  
> 
> 
>   | Perhaps like this?
> 
> Like that, yes, but as you wrote it isn't how it is actually
> done I believe.
> 
> The order looks to be secs, mins, hours, month, day(of month).
> 
> There's no normalisation of the year (no invalid values to correct)
> though the year can be adjusted when the month is adjusted.
> 
> That order is assuming that the system is using posix stretchy seconds,
> rather than UTC (with leap seconds), but almost all NetBSD systems
> work that way - if UTC is being used, then secs cannot be done first,
> as until the rest of the data/time is known, it is unknown whether
> the number of seconds permitted is 59 60 or 61.   With POSIX seconds
> the secs/mins/hours correction order doesn't really matter, as there
> are always 0..59 secs, 0..59 mins, and 0..23 hours, the rest of the
> fields have no impact at all.

Right, but all that strikes me as too much to put into
the manual page here.  I think what matters in this
context is that there is an order and that there is a
cascading effect.

"This normalization is done in order from tm_sec to
tm_mon (inclusive), possibly leading to cascading
values."

or

"This normalization is done in order from tm_sec up to
tm_year, possibly leading to cascading values."

provides sufficient information, I think?

-Jan


Re: CVS commit: src/lib/libc/time

2022-10-21 Thread Robert Elz
Date:Fri, 21 Oct 2022 10:36:23 -0400
From:Jan Schaumann 
Message-ID:  


  | Perhaps like this?

Like that, yes, but as you wrote it isn't how it is actually
done I believe.

The order looks to be secs, mins, hours, month, day(of month).

There's no normalisation of the year (no invalid values to correct)
though the year can be adjusted when the month is adjusted.

That order is assuming that the system is using posix stretchy seconds,
rather than UTC (with leap seconds), but almost all NetBSD systems
work that way - if UTC is being used, then secs cannot be done first,
as until the rest of the data/time is known, it is unknown whether
the number of seconds permitted is 59 60 or 61.   With POSIX seconds
the secs/mins/hours correction order doesn't really matter, as there
are always 0..59 secs, 0..59 mins, and 0..23 hours, the rest of the
fields have no impact at all.

kre



Re: CVS commit: src/lib/libc/time

2022-10-21 Thread Jan Schaumann
Robert Elz  wrote:
> Date:Fri, 21 Oct 2022 03:05:15 +
> From:"Jan Schaumann" 
> Message-ID:  <20221021030515.cdc9df...@cvs.netbsd.org>
> 
>   | Note normalizing behavior of mktime(3) using language from FreeBSD.
> 
> If we are going to start specifying what happens (how the struct tm
> is normalised) then we really also need to specify in which order
> the fields are corrected.   It makes a difference.

Perhaps like this?


Index: ctime.3
===
RCS file: /cvsroot/src/lib/libc/time/ctime.3,v
retrieving revision 1.65
diff -u -p -r1.65 ctime.3
--- ctime.3 21 Oct 2022 03:08:29 -  1.65
+++ ctime.3 21 Oct 2022 14:34:39 -
@@ -264,6 +264,11 @@ of 0 means the day preceding the current
 .Fa tm_mon
 of \-2 means 2 months before January of
 .Fa tm_year .
+This normalization is done in order from
+.Fa tm_sec
+to
+.Fa tm_year ,
+possibly leading to cascading values.
 (A positive or zero value for
 .Fa tm_isdst
 causes


Re: CVS commit: src/lib/libc/time

2022-10-21 Thread Robert Elz
Date:Fri, 21 Oct 2022 03:05:15 +
From:"Jan Schaumann" 
Message-ID:  <20221021030515.cdc9df...@cvs.netbsd.org>

  | Note normalizing behavior of mktime(3) using language from FreeBSD.

If we are going to start specifying what happens (how the struct tm
is normalised) then we really also need to specify in which order
the fields are corrected.   It makes a difference.

kre



re: CVS commit: src/usr.bin/ldd

2022-10-18 Thread matthew green
> With this change, ldd /lib/libc.so.12.220 fails under NetBSD/amd64 9.99.101.
>
>
> /lib/libc.so.12.220:
> ldd: /lib/libc.so.12.220: invalid ELF class 2, expected 1
>
> It seems that elf32_ldd() fails.
>
> Builds of some pkgsrc packages that use gobject introspection and meson fails
> because they uses ldd command during build.

this should be fixed now.  sorry for the failure..


.mrg.


Re: CVS commit: src/usr.bin/ldd

2022-10-18 Thread Ryo ONODERA
Hi,

With this change, ldd /lib/libc.so.12.220 fails under NetBSD/amd64 9.99.101.

/lib/libc.so.12.220:
ldd: /lib/libc.so.12.220: invalid ELF class 2, expected 1

It seems that elf32_ldd() fails.

Builds of some pkgsrc packages that use gobject introspection and meson fails 
because they uses ldd command during build.

Thank you.

-- 
Ryo ONODERA // r...@tetera.org
PGP fingerprint = 82A2 DC91 76E0 A10A 8ABB  FD1B F404 27FA C7D1 15F3

> On Oct 15, 2022, at 14:55, matthew green  wrote:
> 
> Module Name:src
> Committed By:mrg
> Date:Sat Oct 15 05:55:46 UTC 2022
> 
> Modified Files:
>src/usr.bin/ldd: ldd.1 ldd.c
> 
> Log Message:
> ldd(1): add a -v option to display all errors not just the latest.
> 
> ldd on a go binary currently fails with an error that basically
> says "not elf32 class".  this is a true statement, as it is an
> elf64 class object, but it's not useful.  it happens because
> ldd_elf64() is called, fails in _rtld_map_object(), and then
> ldd_elf32() is called, and it fails because the class is wrong,
> and only this error is returned.  (this problem remains.  the
> call to map the object fails due to there being 3 instead of 2
> elf segments in the file.  i guess we need similar code in
> ld.elf_so/map_objects.c as the kernel gained some time ago.)
> 
> perhaps the first error, not the last error, should be used if
> everything fails, but this allows all failures to be see and
> would be useful even if the error string handling changed.
> 
> 
> To generate a diff of this commit:
> cvs rdiff -u -r1.20 -r1.21 src/usr.bin/ldd/ldd.1
> cvs rdiff -u -r1.25 -r1.26 src/usr.bin/ldd/ldd.c
> 
> Please note that diffs are not public domain; they are subject to the
> copyright notices on the relevant files.
> 
> Modified files: Index: src/usr.bin/ldd/ldd.1 diff -u 
> src/usr.bin/ldd/ldd.1:1.20 src/usr.bin/ldd/ldd.1:1.21 --- 
> src/usr.bin/ldd/ldd.1:1.20 Mon Dec 25 05:08:49 2017 +++ src/usr.bin/ldd/ldd.1 
> Sat Oct 15 05:55:45 2022 @@ -1,4 +1,4 @@ -.\" $NetBSD: ldd.1,v 1.20 
> 2017/12/25 05:08:49 maya Exp $ +.\" $NetBSD: ldd.1,v 1.21 2022/10/15 05:55:45 
> mrg Exp $ .\" .\" Copyright (c) 1998 The NetBSD Foundation, Inc. .\" All 
> rights reserved. @@ -27,7 +27,7 @@ .\" ARISING IN ANY WAY OUT OF THE USE OF 
> THIS SOFTWARE, EVEN IF ADVISED OF THE .\" POSSIBILITY OF SUCH DAMAGE. .\" 
> -.Dd December 25, 2017 +.Dd October 15, 2022 .Dt LDD 1 .Os .Sh NAME @@ -35,7 
> +35,7 @@ .Nd list dynamic object dependencies .Sh SYNOPSIS .Nm -.Op Fl o +.Op 
> Fl ov .Op Fl f Ar format .Ar program ... .Sh DESCRIPTION @@ -105,6 +105,10 @@ 
> which makes .Nm behave analogously to .Ic nm Fl o . +.Pp +The +.Fl v +option 
> turns on verbose mode. .Sh EXIT STATUS .Ex -std .Sh SEE ALSO @@ -118,9 +122,3 
> @@ A utility first appeared in SunOS 4.0, it appeared in its current form in 
> .Nx 0.9a . -.Sh BUGS -The -a.out -.Nm -actually runs the program it has been 
> requested to analyze which in specially -constructed environments can have 
> security implications. Index: src/usr.bin/ldd/ldd.c diff -u 
> src/usr.bin/ldd/ldd.c:1.25 src/usr.bin/ldd/ldd.c:1.26 --- 
> src/usr.bin/ldd/ldd.c:1.25 Fri Jul 23 04:20:05 2021 +++ src/usr.bin/ldd/ldd.c 
> Sat Oct 15 05:55:45 2022 @@ -1,4 +1,4 @@ -/* $NetBSD: ldd.c,v 1.25 2021/07/23 
> 04:20:05 martin Exp $ */ +/* $NetBSD: ldd.c,v 1.26 2022/10/15 05:55:45 mrg 
> Exp $ */ /*- * Copyright (c) 1998, 2000 The NetBSD Foundation, Inc. @@ -62,7 
> +62,7 @@ #include #ifndef lint -__RCSID("$NetBSD: ldd.c,v 1.25 2021/07/23 
> 04:20:05 martin Exp $"); +__RCSID("$NetBSD: ldd.c,v 1.26 2022/10/15 05:55:45 
> mrg Exp $"); #endif /* not lint */ #include @@ -124,11 +124,12 @@ main(int 
> argc, char **argv) const char *fmt1 = NULL, *fmt2 = NULL; int c, exit_status 
> = EXIT_SUCCESS; char cwd[MAXPATHLEN], path[MAXPATHLEN]; + bool verbose = 
> false, failed = false; #ifdef DEBUG debug = 1; #endif - while ((c = 
> getopt(argc, argv, "f:o")) != -1) { + while ((c = getopt(argc, argv, "f:ov")) 
> != -1) { switch (c) { case 'f': if (fmt1) { @@ -143,6 +144,9 @@ main(int 
> argc, char **argv) errx(1, "Cannot use -o and -f together"); fmt1 = 
> "%a:-l%o.%m => %p\n"; break; + case 'v': + verbose = true; + break; default: 
> usage(); /*NOTREACHED*/ @@ -174,17 +178,31 @@ main(int argc, char **argv) 
> warn("%s", *argv); continue; } - if (elf_ldd(fd, *argv, path, fmt1, fmt2) == 
> -1 - /* Alpha never had 32 bit support. */ + if (elf_ldd(fd, *argv, path, 
> fmt1, fmt2) == -1) { + if (verbose) + warnx("%s", error_message); + failed = 
> true; + } + /* Alpha never had 32 bit support. */ #if (defined(_LP64) && 
> !defined(ELF64_ONLY)) || defined(MIPS_N32) - && elf32_ldd(fd, *argv, path, 
> fmt1, fmt2) == -1 + if (elf32_ldd(fd, *argv, path, fmt1, fmt2) == -1) { + if 
> (verbose) + warnx("%s", error_message); + failed = true; + } #if 
> defined(__mips__) && 0 /* XXX this is still hosed for some reason */ - && 
> elf32_ldd_compat(fd, *argv, path, fmt1, fmt2) == -1 + if 
> (elf32_ldd_compat(fd, *argv, path, fmt1, 

Re: CVS commit: src/sys

2022-10-12 Thread Christos Zoulas
In article <20221011220338.17a48f...@cvs.netbsd.org>,
Andrius Varanavicius  wrote:
>-=-=-=-=-=-
>
>Module Name:   src
>Committed By:  andvar
>Date:  Tue Oct 11 22:03:37 UTC 2022
>
>Modified Files:
>   src/sys/arch/arm/arm32: bus_dma.c
>   src/sys/arch/hppa/hppa: mainbus.c
>   src/sys/arch/vax/vax: bus_dma.c
>   src/sys/dev/pci: if_bge.c vioscsi.c
>
>Log Message:
>fix typos in log messages s/bus_dmamem_create/bus_dmamap_create/ and
>s/bus_dmamem_load/bus_dmamap_load/.
>Inspired by recent similar fixes in OpenBSD.

Yes, but we have __func__ now :-)

christos



Re: CVS commit: src/usr.bin/make

2022-10-10 Thread Robert Elz
Date:Mon, 10 Oct 2022 17:33:35 +
From:"Roland Illig" 
Message-ID:  <20221010173335.c3cccf...@cvs.netbsd.org>

  | Document only the POSIX requirement for now, as I didn't find
  | information about _which_ ancient UNIX systems would corrupt the
  | filesystem on unlinking a directory.

Everthinng before the rename() mkdir() and rmdir() sys calls
were added, which was about 4.2bsd I think - and as I recall
the ability remained for some time after those were added.

Not sure when the UTS stream caught up, probably about the
time when Sun switched SunOS to being SysV based.

  | http://man.cat-v.org/unix-1st/2/sys-unlink (1971) says:
  | > It is also illegal to unlink a directory (except for the super-user).

That has always been true ... the function in make is for when make
is being run as root - not needed otherwise.

kre

ps: I agree that sys calls and emulations thereof should always return
non-zero (and hence true) in the error case (but not necessarily always
0 when successful).  That is simply the way they work and always have.


Re: CVS commit: src/sys/kern

2022-10-04 Thread Robert Elz
Date:Tue, 04 Oct 2022 10:09:35 -0400
From:Christos Zoulas 
Message-ID:  <8dd220d16861eb3a890461bdf02d1...@zoulas.com>

  | I always forget the O_CLOEXEC is special
  | in that regard. I wish it was not, but it is difficult to fix.

POSIX is adding O_CLOFORK in the next version (no guarantees I remembered
the symbol name spelling correctly here) which will have essentially the
same (open time) semantics (similar long term sematics as well, just
applied at a different time).

I assume we will need to add that at some point or other.

  | The question is how to find the vnode?

Not really, I assume that part will be fairly easy (probably trivial),
I just didn't have the energy to go work it out when sending that mail.
We have the file descriptor, and I suspect the file* (need to check to
make sure the right one is immediately available, but we can get it from
the fd if not), we know it refers to a vnode (it came from vn_open()),
so getting the vnode* from the file* is not something difficult, I think.

  | Perhaps it is easiest to fail the open call if O_EXLOCK or
  | O_SHLOCK are specified in a cloning open?

That would be an option, and is better than just ignoring them, but
better still would be to make them work.

Since open_setfp() does nothing (much) when none of the relevant O_xxx
flags that it tests are set (the fd open flags, as distinct from the
fp ones), and the code calls VOP_UNLOCK(vp) after it, we know that vp
is intended to be locked when open_setfp() is called (further confirmed
as when any of the O_??LOCK flags is set, open_setfp() does a VOP_UNLOCK()
and later a vn_lock() (which I am guessing is the inverse).

Maybe all that's needed is a vn_lock() call (on the vp that we still need
to fetch) and then call open_setfp() ?   But this is all beyond what I
know enough about to be sure, particularly to avoid doing anything which
might deadlock, etc.

kre

ps: if this gets done properly, then special case code to handle O_NONBLOCK
(and O_NOSIGPIPE, ...) in cloning device drivers won't be needed either,
open_setfp() is where all of that is normally added to the file* for the
fd being returned, it was not "simply happening" because that call is
missing in the cloned device case.





Re: CVS commit: src/sys/kern

2022-10-04 Thread Christos Zoulas

On 2022-10-01 3:39 pm, Robert Elz wrote:

Date:Sat, 1 Oct 2022 13:00:04 -0400

[stuff deleted]


Even when it is called, the code is:

fp->f_flag = flags & FMASK;

where FMASK is (from fcntl.h)

#define FMASK   (FREAD|FWRITE|FCNTLFLAGS|FEXEC)

and

#define FCNTLFLAGS  
(FAPPEND|FASYNC|FFSYNC|FNONBLOCK|FDSYNC|FRSYNC|FALTIO|\

 FDIRECT|FNOSIGPIPE)

which all looks exactly as it should be to me - and note that O_CLOEXEC
(which has no F equivalent name - it doesn't need one) is not 
there.


So, fp->f_flag isn't set at all (is probably 0), and even if it were
O_CLOEXEC would not be there, and should not be.


Thanks for pointing that out, I always forget the O_CLOEXEC is special
in that regard. I wish it was not, but it is difficult to fix.



For the vnode actually being opened, none of this matters, as the open
lasts as long (actually not as long) as the open() sys call - when that
returns, the device being opened has been closed again, so what the
file that refers to it looks like (or would have looked like) really
doesn't matter at all, it never becomes visible.   That's my guess as
to why the open_setfp() call is missing in that case.

But what got forgotten (or deliberately was not done) was anything to
affect the modes of the clone which is opened.

  | What does it mean when the open specifies O_CLOEXEC
  | and ff->ff_exclose is false? Can that happen? Is that desirable?

It is what is currently happening whenever we open a cloning device.
(Or that is what it looks like to me).   Desirable, no idea, I didn't
define the semantics of cloning device opens, nor which of the open
flags should apply to the clone that is created.

  | I am fine with the locking to stay where it is. I guess it is 
probably

  | not needed after dup/clone, since the underlying vnode is shared...

Assuming you mean dup(2) (and dup2()) and clone(2) then no - those have
no way to pass the relevant locking flags, if you have a fd and want to
apply a lock, fcntl() is what does that (and the fcntl operations that
duplicate fds do not also apply locks).

The only issue is O_EXLOCK and O_SHLOCK (and O_CLOEXEC) for cloned 
devices.


I suspect it makes sense for O_CLOEXEC to be applied in that case, it
makes little sense for open("/dev/ptmx", O_RDWR|O_CLOEXEC) to succeed,
returning an open fd which does not have cloexec set (which is the 
issue,

along with O_NONBLOCK) which started all of this discussion.

The locking flags I am less sure about.   I don't see how they can fail
to succeed if applied, as the vnode for the device has just been 
created,
nothing else can possibly have any kind of lock on it.   Whether 
there's

any benefit in applying the lock so that the node is locked for later,
I don't know - but it certainly should do no harm to do that.

It seems clear to me that what we need is (something like)

Index: vfs_syscalls.c
===
RCS file: /cvsroot/src/sys/kern/vfs_syscalls.c,v
retrieving revision 1.555
diff -u -r1.555 vfs_syscalls.c
--- vfs_syscalls.c  12 Feb 2022 15:51:29 -  1.555
+++ vfs_syscalls.c  1 Oct 2022 19:27:15 -
@@ -1763,6 +1763,9 @@
error = fd_dupopen(dupfd, dupfd_move, flags, );
if (error)
return error;
+   error = open_setfp(l, fp, XXXvp, indx, flags);
+   if (error)
+   return error;
*fd = indx;
} else {
error = open_setfp(l, fp, vp, indx, flags);


where XXXvp needs to be extracted from somewhere (it isn't vp, as 
vp==NULL)

except that what follows in the else case is ...

if (error)
return error;
VOP_UNLOCK(vp);
*fd = indx;


That VOP_UNLOCK(vp) is what is bothering me,   It tells me that 
open_setfp()
is expecting to be called with vp locked - but in the first case (the 
cloning
case) there is no VOP_UNLOCK() call (and what's more, fd_dupopen() 
cannot
do it, as it has no vp arg).   That means, I believe, that when 
vn_open()
returns in the normal case, vp is returned locked, but in the cloning 
case

the vnode that was created for the clone is not locked.

I'm not sure what is the right way to find the vnode, or how it should
properly be locked so open_setfp() can do its thing.   If I knew all of
that I would have made an attempt at fixing this already.   We need
someone who really understands what is happening here, and the right
way to handle it all (which very likely is nothing like I just 
suggested).


The question is how to find the vnode? Perhaps it is easiest to fail the
open call if O_EXLOCK or O_SHLOCK are specified in a cloning open? At 
least

we will not silently ignore them?

Best,

christos


Re: CVS commit: src/sys/dev

2022-10-04 Thread Masanobu SAITOH


On 2022/10/04 19:22, Taylor R Campbell wrote:
>> Date: Tue, 4 Oct 2022 17:16:35 +0900
>> From: Masanobu SAITOH 
>>
>> Before reverting changes, one of my machine which use serial console
>> didn't print the Copyright message.
>>
>> Revert two revert commit, i.e.
>> http://mail-index.netbsd.org/source-changes/2022/10/04/msg141389.html
>> http://mail-index.netbsd.org/source-changes/2022/10/04/msg141388.html
>> and applied consokfix.patch.
>> [...]
>> The boot message is not printed and I can see messages after /sbin/init.
>> dmesg(1) shows the kernel messages.
> 
> Can you please try with consokfix.patch _and_ consprintfix.patch?
> 
> consprintfix.patch should restore the kernel messages on the console.

It worked!

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: CVS commit: src/sys/dev

2022-10-04 Thread Taylor R Campbell
> Date: Tue, 4 Oct 2022 17:16:35 +0900
> From: Masanobu SAITOH 
> 
> Before reverting changes, one of my machine which use serial console
> didn't print the Copyright message.
> 
> Revert two revert commit, i.e.
> http://mail-index.netbsd.org/source-changes/2022/10/04/msg141389.html
> http://mail-index.netbsd.org/source-changes/2022/10/04/msg141388.html
> and applied consokfix.patch.
> [...]
> The boot message is not printed and I can see messages after /sbin/init.
> dmesg(1) shows the kernel messages.

Can you please try with consokfix.patch _and_ consprintfix.patch?

consprintfix.patch should restore the kernel messages on the console.
>From 0f058a0e89e3f545a9020a2fb79dadd7ad89029a Mon Sep 17 00:00:00 2001
From: Taylor R Campbell 
Date: Tue, 4 Oct 2022 05:48:39 +
Subject: [PATCH] squash! constty(4): Make MP-safe.

- Fix reversed sense of conditional.
---
 sys/kern/subr_prf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sys/kern/subr_prf.c b/sys/kern/subr_prf.c
index e87e6efc8501..53fb20c1d393 100644
--- a/sys/kern/subr_prf.c
+++ b/sys/kern/subr_prf.c
@@ -425,7 +425,7 @@ putone(int c, int flags, struct tty *tp)
if ((flags & TOLOG) &&
c != '\0' && c != '\r' && c != 0177)
logputchar(c);
-   if ((flags & TOCONS) && ctp != NULL && c != '\0')
+   if ((flags & TOCONS) && ctp == NULL && c != '\0')
(*v_putc)(c);
 
pserialize_read_exit(s);


Re: CVS commit: src/sys/dev

2022-10-04 Thread Masanobu SAITOH
Hi.

On 2022/10/04 16:24, Ryo ONODERA wrote:
> Hi,
> 
> Taylor R Campbell  writes:
> 
>>> Date: Tue, 04 Oct 2022 15:53:58 +0900
>>> From: Ryo ONODERA 
>>>
>>> With this patch, it works fine for me.
>>> There is no stall after genfb(4).
>>> And I do not find any other problem so far.
>>
>> Thanks!  There probably is another problem which is that kernel
>> console printfs might stop appearing after a certain point, which the
>> following patch might fix too -- I inadvertently reversed the sense
>> of a conditional in the subr_prf.c changes.
> 
> I have not encountered another problem yet. However with your
> consprintfix.patch, it works fine for me too.

Before reverting changes, one of my machine which use serial console
didn't print the Copyright message.

Revert two revert commit, i.e.
http://mail-index.netbsd.org/source-changes/2022/10/04/msg141389.html
http://mail-index.netbsd.org/source-changes/2022/10/04/msg141388.html
and applied consokfix.patch.

> NetBSD MBR boot
> 
> NetBSD/x86 ffsv2 Primary Bootstrap
> 
> 0 seconds.
> booting hd0a:netbsd (howto 0xa)
> 72567816+35522344+2226392 [945568+1555416+1162803]=0x6d7f2a8
> Loading /var/db/entropy-file
> Tue Oct  4 17:07:44 JST 2022
> Starting root file system check:
> /dev/rwd0a: file system is clean; not checking
> Setting sysctl variables:
...

The boot message is not printed and I can see messages after /sbin/init.
dmesg(1) shows the kernel messages.

 Regards.


> Thank you.
> 
>> From 0f058a0e89e3f545a9020a2fb79dadd7ad89029a Mon Sep 17 00:00:00 2001
>> From: Taylor R Campbell 
>> Date: Tue, 4 Oct 2022 05:48:39 +
>> Subject: [PATCH] squash! constty(4): Make MP-safe.
>>
>> - Fix reversed sense of conditional.
>> ---
>>  sys/kern/subr_prf.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/sys/kern/subr_prf.c b/sys/kern/subr_prf.c
>> index e87e6efc8501..53fb20c1d393 100644
>> --- a/sys/kern/subr_prf.c
>> +++ b/sys/kern/subr_prf.c
>> @@ -425,7 +425,7 @@ putone(int c, int flags, struct tty *tp)
>>  if ((flags & TOLOG) &&
>>  c != '\0' && c != '\r' && c != 0177)
>>  logputchar(c);
>> -if ((flags & TOCONS) && ctp != NULL && c != '\0')
>> +if ((flags & TOCONS) && ctp == NULL && c != '\0')
>>  (*v_putc)(c);
>>  
>>  pserialize_read_exit(s);
> 

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: CVS commit: src/sys/arch

2022-10-04 Thread Rin Okuyama

On 2022/10/04 16:38, matthew green wrote:

"Rin Okuyama" writes:

Module Name:src
Committed By:   rin
Date:   Tue Oct  4 07:24:32 UTC 2022

Modified Files:
src/sys/arch/amiga/dev: ser.c
src/sys/arch/sgimips/dev: scn.c

Log Message:
Remove unused extern declaration of constty.


thank you.


Welcome!


can someone please find all the "extern ;"
in .c files, and move them all into .h files.


Well, it should be a tough work ;)

% cd /usr/src/sys && find . -name '*.c' | grep -v /external/ | xargs grep 
'extern[  ]' | wc -l
4170

Thanks,
rin


re: CVS commit: src/sys/arch

2022-10-04 Thread matthew green
"Rin Okuyama" writes:
> Module Name:  src
> Committed By: rin
> Date: Tue Oct  4 07:24:32 UTC 2022
>
> Modified Files:
>   src/sys/arch/amiga/dev: ser.c
>   src/sys/arch/sgimips/dev: scn.c
>
> Log Message:
> Remove unused extern declaration of constty.

thank you.

can someone please find all the "extern ;"
in .c files, and move them all into .h files.

please.


.mrg.


Re: CVS commit: src/sys/dev

2022-10-04 Thread Ryo ONODERA
Hi,

Taylor R Campbell  writes:

>> Date: Tue, 04 Oct 2022 15:53:58 +0900
>> From: Ryo ONODERA 
>> 
>> With this patch, it works fine for me.
>> There is no stall after genfb(4).
>> And I do not find any other problem so far.
>
> Thanks!  There probably is another problem which is that kernel
> console printfs might stop appearing after a certain point, which the
> following patch might fix too -- I inadvertently reversed the sense
> of a conditional in the subr_prf.c changes.

I have not encountered another problem yet. However with your
consprintfix.patch, it works fine for me too.

Thank you.

> From 0f058a0e89e3f545a9020a2fb79dadd7ad89029a Mon Sep 17 00:00:00 2001
> From: Taylor R Campbell 
> Date: Tue, 4 Oct 2022 05:48:39 +
> Subject: [PATCH] squash! constty(4): Make MP-safe.
>
> - Fix reversed sense of conditional.
> ---
>  sys/kern/subr_prf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/sys/kern/subr_prf.c b/sys/kern/subr_prf.c
> index e87e6efc8501..53fb20c1d393 100644
> --- a/sys/kern/subr_prf.c
> +++ b/sys/kern/subr_prf.c
> @@ -425,7 +425,7 @@ putone(int c, int flags, struct tty *tp)
>   if ((flags & TOLOG) &&
>   c != '\0' && c != '\r' && c != 0177)
>   logputchar(c);
> - if ((flags & TOCONS) && ctp != NULL && c != '\0')
> + if ((flags & TOCONS) && ctp == NULL && c != '\0')
>   (*v_putc)(c);
>  
>   pserialize_read_exit(s);

-- 
Ryo ONODERA // r...@tetera.org
PGP fingerprint = 82A2 DC91 76E0 A10A 8ABB  FD1B F404 27FA C7D1 15F3


Re: CVS commit: src/sys/dev

2022-10-04 Thread Taylor R Campbell
> Date: Tue, 04 Oct 2022 15:53:58 +0900
> From: Ryo ONODERA 
> 
> With this patch, it works fine for me.
> There is no stall after genfb(4).
> And I do not find any other problem so far.

Thanks!  There probably is another problem which is that kernel
console printfs might stop appearing after a certain point, which the
following patch might fix too -- I inadvertently reversed the sense
of a conditional in the subr_prf.c changes.
>From 0f058a0e89e3f545a9020a2fb79dadd7ad89029a Mon Sep 17 00:00:00 2001
From: Taylor R Campbell 
Date: Tue, 4 Oct 2022 05:48:39 +
Subject: [PATCH] squash! constty(4): Make MP-safe.

- Fix reversed sense of conditional.
---
 sys/kern/subr_prf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sys/kern/subr_prf.c b/sys/kern/subr_prf.c
index e87e6efc8501..53fb20c1d393 100644
--- a/sys/kern/subr_prf.c
+++ b/sys/kern/subr_prf.c
@@ -425,7 +425,7 @@ putone(int c, int flags, struct tty *tp)
if ((flags & TOLOG) &&
c != '\0' && c != '\r' && c != 0177)
logputchar(c);
-   if ((flags & TOCONS) && ctp != NULL && c != '\0')
+   if ((flags & TOCONS) && ctp == NULL && c != '\0')
(*v_putc)(c);
 
pserialize_read_exit(s);


Re: CVS commit: src/sys/dev

2022-10-04 Thread Ryo ONODERA
Hi,

Taylor R Campbell  writes:

>> Date: Tue, 04 Oct 2022 12:12:15 +0900
>> From: Ryo ONODERA 
>> 
>> "Taylor R Campbell"  writes:
>> 
>> > console(4), constty(4): Rip off the kernel lock.
>> 
>> After introduction of MP-safe console/constty, my kernel stopped
>> just after genfb(4) detection.
>> LOCKDEBUG, DIAGNOSTIC, DEBUG options does not provide any additional
>> information with me.
>> Could you take a look at my problem?
>
> Sorry about that -- I've reverted this change and the MP-safe cons(4)
> change for now, but let's try to figure out what's wrong with them so
> I can reapply them and get the console paths out of the kernel lock
> for good.

No problem. And thanks for your quick response.

> Can you try the attached patch on top?

With this patch, it works fine for me.
There is no stall after genfb(4).
And I do not find any other problem so far.

$ cd /usr/src
$ TZ=UTC cvs up -dP -D2022-10-04
$ patch -p1 < ~/consokfix.patch
$ ./build.sh ... kernel=...

Thank you very much!!!

> From 2de03f1efbe5b73d42dc2f59730c17b99c04b3b9 Mon Sep 17 00:00:00 2001
> From: Taylor R Campbell 
> Date: Tue, 4 Oct 2022 05:24:49 +
> Subject: [PATCH] squash! constty(4): Make MP-safe.
>
> - Fix initialization of ok in cn_redirect.
> ---
>  sys/dev/cons.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/sys/dev/cons.c b/sys/dev/cons.c
> index f4f9a1602221..e621292a6b4a 100644
> --- a/sys/dev/cons.c
> +++ b/sys/dev/cons.c
> @@ -463,7 +463,7 @@ cn_redirect(dev_t *devp, int is_read, int *error, struct 
> tty **ctpp)
>   dev_t dev = *devp;
>   struct tty *ctp;
>   int s;
> - bool ok;
> + bool ok = false;
>  
>   *error = ENXIO;
>   *ctpp = NULL;
> @@ -472,18 +472,17 @@ cn_redirect(dev_t *devp, int is_read, int *error, 
> struct tty **ctpp)
>   (cn_tab == NULL || (cn_tab->cn_pri != CN_REMOTE))) {
>   if (is_read) {
>   *error = 0;
> - ok = false;
>   goto out;
>   }
>   tty_acquire(ctp);
>   *ctpp = ctp;
>   dev = ctp->t_dev;
>   } else if (cn_tab == NULL) {
> - ok = false;
>   goto out;
>   } else {
>   dev = cn_tab->cn_dev;
>   }
> + ok = true;
>   *devp = dev;
>  out: pserialize_read_exit(s);
>   return ok;

-- 
Ryo ONODERA // r...@tetera.org
PGP fingerprint = 82A2 DC91 76E0 A10A 8ABB  FD1B F404 27FA C7D1 15F3


Re: CVS commit: src/sys/dev

2022-10-03 Thread Taylor R Campbell
> Date: Tue, 04 Oct 2022 12:12:15 +0900
> From: Ryo ONODERA 
> 
> "Taylor R Campbell"  writes:
> 
> > console(4), constty(4): Rip off the kernel lock.
> 
> After introduction of MP-safe console/constty, my kernel stopped
> just after genfb(4) detection.
> LOCKDEBUG, DIAGNOSTIC, DEBUG options does not provide any additional
> information with me.
> Could you take a look at my problem?

Sorry about that -- I've reverted this change and the MP-safe cons(4)
change for now, but let's try to figure out what's wrong with them so
I can reapply them and get the console paths out of the kernel lock
for good.

Can you try the attached patch on top?
>From 2de03f1efbe5b73d42dc2f59730c17b99c04b3b9 Mon Sep 17 00:00:00 2001
From: Taylor R Campbell 
Date: Tue, 4 Oct 2022 05:24:49 +
Subject: [PATCH] squash! constty(4): Make MP-safe.

- Fix initialization of ok in cn_redirect.
---
 sys/dev/cons.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/sys/dev/cons.c b/sys/dev/cons.c
index f4f9a1602221..e621292a6b4a 100644
--- a/sys/dev/cons.c
+++ b/sys/dev/cons.c
@@ -463,7 +463,7 @@ cn_redirect(dev_t *devp, int is_read, int *error, struct 
tty **ctpp)
dev_t dev = *devp;
struct tty *ctp;
int s;
-   bool ok;
+   bool ok = false;
 
*error = ENXIO;
*ctpp = NULL;
@@ -472,18 +472,17 @@ cn_redirect(dev_t *devp, int is_read, int *error, struct 
tty **ctpp)
(cn_tab == NULL || (cn_tab->cn_pri != CN_REMOTE))) {
if (is_read) {
*error = 0;
-   ok = false;
goto out;
}
tty_acquire(ctp);
*ctpp = ctp;
dev = ctp->t_dev;
} else if (cn_tab == NULL) {
-   ok = false;
goto out;
} else {
dev = cn_tab->cn_dev;
}
+   ok = true;
*devp = dev;
 out:   pserialize_read_exit(s);
return ok;


Re: CVS commit: src/sys/dev

2022-10-03 Thread Ryo ONODERA
Hi,

"Taylor R Campbell"  writes:

> Module Name:  src
> Committed By: riastradh
> Date: Mon Oct  3 19:57:25 UTC 2022
>
> Modified Files:
>   src/sys/dev: cons.c
>
> Log Message:
> console(4), constty(4): Rip off the kernel lock.

After introduction of MP-safe console/constty, my kernel stopped
just after genfb(4) detection.
LOCKDEBUG, DIAGNOSTIC, DEBUG options does not provide any additional
information with me.
Could you take a look at my problem?

My dmesg just before MP-safe console.constty is here:

Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017,
2018, 2019, 2020, 2021, 2022
The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.

NetBSD 9.99.100 (DTRACE8) #16: Tue Oct  4 09:36:04 JST 2022
ryoon@brownie:/usr/world/9.99/amd64/obj/sys/arch/amd64/compile/DTRACE8
total memory = 15680 MB
avail memory = 15137 MB
timecounter: Timecounters tick every 10.000 msec
Kernelized RAIDframe activated
pms* disabled
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
efi: systbl at pa c9f7e018
mainbus0 (root)
ACPI: RSDP 0xCDFFE014 24 (v02 HPQOEM)
ACPI: XSDT 0xCDFA4228 000174 (v01 HPQOEM SLIC-MPC 0002 HP   
0113)
ACPI: FACP 0xCDFE 00010C (v05 HPQOEM SLIC-MPC 0002 HP   
0004)
ACPI: DSDT 0xCDFD2000 009607 (v01 HPQOEM 8929 0002 HP   
0004)
ACPI: FACS 0xCDEB3000 40
ACPI: UEFI 0xCDF7E000 000236 (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: SSDT 0xCDFFC000 00020D (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: SSDT 0xCDFF4000 0072B0 (v02 HPQOEM 8929 0002 HP   
0004)
ACPI: IVRS 0xCDFF3000 0001A4 (v02 HPQOEM 8929 0001 HP   
0004)
ACPI: SSDT 0xCDFEF000 003A21 (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: SSDT 0xCDFEE000 000472 (v02 HPQOEM 8929 1000 HP   
0004)
ACPI: TPM2 0xCDFED000 4C (v04 HPQOEM 8929 0002 HP   
0004)
ACPI: SSDT 0xCDFEC000 00017D (v01 HPQOEM 8929 1000 HP   
0004)
ACPI: SSDT 0xCDFE4000 007B3B (v01 HPQOEM 8929 1000 HP   
0004)
ACPI: MSDM 0xCDFE3000 55 (v03 HPQOEM SLIC-MPC 0001 HP   
0004)
ACPI: ASF! 0xCDFE2000 A5 (v32 HPQOEM 8929 0002 HP   
0004)
ACPI: BOOT 0xCDFE1000 28 (v01 HPQOEM 8929 0002 HP   
0004)
ACPI: HPET 0xCDFDF000 38 (v01 HPQOEM 8929 0002 HP   
0004)
ACPI: APIC 0xCDFDE000 000138 (v03 HPQOEM 8929 0002 HP   
0004)
ACPI: MCFG 0xCDFDD000 3C (v01 HPQOEM 8929 0002 HP   
0004)
ACPI: SSDT 0xCDFD1000 C1 (v01 HPQOEM 8929 0002 HP   
0004)
ACPI: SSDT 0xCDFD 80 (v01 HPQOEM 8929 0002 HP   
0004)
ACPI: VFCT 0xCDFC2000 00D884 (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: SSDT 0xCDFFD000 F8 (v01 HPQOEM 8929 1000 HP   
0004)
ACPI: SSDT 0xCDFC 5C (v02 HPQOEM 8929 1000 HP   
0004)
ACPI: SSDT 0xCDFB9000 005354 (v02 HPQOEM 8929 0001 HP   
0004)
ACPI: CRAT 0xCDFB8000 000EE8 (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: CDIT 0xCDFB7000 29 (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: SSDT 0xCDFB6000 000139 (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: SSDT 0xCDFB5000 00028D (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: SSDT 0xCDFB4000 000372 (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: SSDT 0xCDFB3000 00021F (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: SSDT 0xCDFB2000 000D53 (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: SSDT 0xCDFB 0010C5 (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: SSDT 0xCDFAC000 00362F (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: FPDT 0xCDFAB000 44 (v01 HPQOEM SLIC-MPC 0002 HP   
0004)
ACPI: WSMT 0xCDFA9000 28 (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: SSDT 0xCDFA8000 42 (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: SSDT 0xCDFA7000 00020A (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: SSDT 0xCDFA6000 0005AD (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: SSDT 0xCDFA5000 0002E9 (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: SSDT 0xCDFC1000 7D (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: SSDT 0xCDFA3000 C7 (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: SSDT 0xCDFA2000 0004DB (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: BGRT 0xCDFDC000 38 (v01 HPQOEM 8929 0001 HP   
0004)
ACPI: 26 ACPI AML tables successfully acquired and loaded
ioapic0 at mainbus0 

Re: CVS commit: src/external/cddl/osnet/dist/uts/common/fs/zfs

2022-10-03 Thread J. Hannken-Illjes
Yes -- I suppose genfs_pathconf() is sufficient here.

--
J. Hannken-Illjes - hann...@mailbox.org

> On 3. Oct 2022, at 11:40, Frank Kardel  wrote:
> 
> 
> Good to know. So do we need to add path conf there?
> Frank
> 
> 3 Oct 2022 11:34:09 J. Hannken-Illjes :
> 
>> Frank,
>> 
>> the vnode operations for ".zfs" are in zfs_ctldir.c and there
>> is no pathconf operation here ...
>> 
>> --
>> J. Hannken-Illjes - hann...@mailbox.org
>> 
>>> On 3. Oct 2022, at 11:26, Frank Kardel  wrote:
>>> 
>>> Well, I am am happy to do that. But ls got eopnotsupp for fpathconf of ACL 
>>> names and barked at them when listing the .zfs snapshot directory. Ls is 
>>> only ignoring einval at that place. So something must be wrong there wrt 
>>> tog.
>>> Frank
>>> 
>>> 3 Oct 2022 10:53:10 J. Hannken-Illjes :
>>> 
> On 27. Sep 2022, at 12:33, Frank Kardel  wrote:
> 
> Module Name:src
> Committed By:   kardel
> Date:   Tue Sep 27 10:33:21 UTC 2022
> 
> Modified Files:
> src/external/cddl/osnet/dist/uts/common/fs/zfs: zfs_vnops.c
> 
> Log Message:
> for unsupported names return EINVAL as per TOG
> https://pubs.opengroup.org/onlinepubs/9699919799/functions/fpathconf.html
> discussed with christos@
> 
> 
> To generate a diff of this commit:
> cvs rdiff -u -r1.78 -r1.79 \
>src/external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vnops.c
 
 This is completely wrong!
 
 The call sequence is "VOP_PATHCONF -> zfs_netbsd_pathconf -> zfs_pathconf"
 where zfs_netbsd_pathconf() handles the NetBSD specific names if
 zfs_pathconf() returns EOPNOTSUPP and also returns EINVAL for
 unsupported names in this case.
 
 Please revert.
 
 --
 J. Hannken-Illjes - hann...@mailbox.org



signature.asc
Description: Message signed with OpenPGP


Re: CVS commit: src/external/cddl/osnet/dist/uts/common/fs/zfs

2022-10-03 Thread Frank Kardel


Good to know. So do we need to add path conf there?
Frank

3 Oct 2022 11:34:09 J. Hannken-Illjes :

> Frank,
> 
> the vnode operations for ".zfs" are in zfs_ctldir.c and there
> is no pathconf operation here ...
> 
> --
> J. Hannken-Illjes - hann...@mailbox.org
> 
>> On 3. Oct 2022, at 11:26, Frank Kardel  wrote:
>> 
>> Well, I am am happy to do that. But ls got eopnotsupp for fpathconf of ACL 
>> names and barked at them when listing the .zfs snapshot directory. Ls is 
>> only ignoring einval at that place. So something must be wrong there wrt tog.
>> Frank
>> 
>> 3 Oct 2022 10:53:10 J. Hannken-Illjes :
>> 
 On 27. Sep 2022, at 12:33, Frank Kardel  wrote:
 
 Module Name:    src
 Committed By:   kardel
 Date:   Tue Sep 27 10:33:21 UTC 2022
 
 Modified Files:
     src/external/cddl/osnet/dist/uts/common/fs/zfs: zfs_vnops.c
 
 Log Message:
 for unsupported names return EINVAL as per TOG
 https://pubs.opengroup.org/onlinepubs/9699919799/functions/fpathconf.html
 discussed with christos@
 
 
 To generate a diff of this commit:
 cvs rdiff -u -r1.78 -r1.79 \
    src/external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vnops.c
>>> 
>>> This is completely wrong!
>>> 
>>> The call sequence is "VOP_PATHCONF -> zfs_netbsd_pathconf -> zfs_pathconf"
>>> where zfs_netbsd_pathconf() handles the NetBSD specific names if
>>> zfs_pathconf() returns EOPNOTSUPP and also returns EINVAL for
>>> unsupported names in this case.
>>> 
>>> Please revert.
>>> 
>>> --
>>> J. Hannken-Illjes - hann...@mailbox.org


Re: CVS commit: src/external/cddl/osnet/dist/uts/common/fs/zfs

2022-10-03 Thread J. Hannken-Illjes
Frank,

the vnode operations for ".zfs" are in zfs_ctldir.c and there
is no pathconf operation here ...

--
J. Hannken-Illjes - hann...@mailbox.org

> On 3. Oct 2022, at 11:26, Frank Kardel  wrote:
> 
> Well, I am am happy to do that. But ls got eopnotsupp for fpathconf of ACL 
> names and barked at them when listing the .zfs snapshot directory. Ls is only 
> ignoring einval at that place. So something must be wrong there wrt tog.
> Frank
> 
> 3 Oct 2022 10:53:10 J. Hannken-Illjes :
> 
>>> On 27. Sep 2022, at 12:33, Frank Kardel  wrote:
>>> 
>>> Module Name:src
>>> Committed By:   kardel
>>> Date:   Tue Sep 27 10:33:21 UTC 2022
>>> 
>>> Modified Files:
>>> src/external/cddl/osnet/dist/uts/common/fs/zfs: zfs_vnops.c
>>> 
>>> Log Message:
>>> for unsupported names return EINVAL as per TOG
>>> https://pubs.opengroup.org/onlinepubs/9699919799/functions/fpathconf.html
>>> discussed with christos@
>>> 
>>> 
>>> To generate a diff of this commit:
>>> cvs rdiff -u -r1.78 -r1.79 \
>>>src/external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vnops.c
>> 
>> This is completely wrong!
>> 
>> The call sequence is "VOP_PATHCONF -> zfs_netbsd_pathconf -> zfs_pathconf"
>> where zfs_netbsd_pathconf() handles the NetBSD specific names if
>> zfs_pathconf() returns EOPNOTSUPP and also returns EINVAL for
>> unsupported names in this case.
>> 
>> Please revert.
>> 
>> --
>> J. Hannken-Illjes - hann...@mailbox.org



signature.asc
Description: Message signed with OpenPGP


Re: CVS commit: src/external/cddl/osnet/dist/uts/common/fs/zfs

2022-10-03 Thread Frank Kardel
Well, I am am happy to do that. But ls got eopnotsupp for fpathconf of ACL 
names and barked at them when listing the .zfs snapshot directory. Ls is only 
ignoring einval at that place. So something must be wrong there wrt tog.
Frank

3 Oct 2022 10:53:10 J. Hannken-Illjes :

>> On 27. Sep 2022, at 12:33, Frank Kardel  wrote:
>> 
>> Module Name:    src
>> Committed By:   kardel
>> Date:   Tue Sep 27 10:33:21 UTC 2022
>> 
>> Modified Files:
>>     src/external/cddl/osnet/dist/uts/common/fs/zfs: zfs_vnops.c
>> 
>> Log Message:
>> for unsupported names return EINVAL as per TOG
>> https://pubs.opengroup.org/onlinepubs/9699919799/functions/fpathconf.html
>> discussed with christos@
>> 
>> 
>> To generate a diff of this commit:
>> cvs rdiff -u -r1.78 -r1.79 \
>>    src/external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vnops.c
> 
> This is completely wrong!
> 
> The call sequence is "VOP_PATHCONF -> zfs_netbsd_pathconf -> zfs_pathconf"
> where zfs_netbsd_pathconf() handles the NetBSD specific names if
> zfs_pathconf() returns EOPNOTSUPP and also returns EINVAL for
> unsupported names in this case.
> 
> Please revert.
> 
> --
> J. Hannken-Illjes - hann...@mailbox.org


Re: CVS commit: src/external/cddl/osnet/dist/uts/common/fs/zfs

2022-10-03 Thread J. Hannken-Illjes
> On 27. Sep 2022, at 12:33, Frank Kardel  wrote:
> 
> Module Name:  src
> Committed By: kardel
> Date: Tue Sep 27 10:33:21 UTC 2022
> 
> Modified Files:
>   src/external/cddl/osnet/dist/uts/common/fs/zfs: zfs_vnops.c
> 
> Log Message:
> for unsupported names return EINVAL as per TOG
> https://pubs.opengroup.org/onlinepubs/9699919799/functions/fpathconf.html
> discussed with christos@
> 
> 
> To generate a diff of this commit:
> cvs rdiff -u -r1.78 -r1.79 \
>src/external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vnops.c

This is completely wrong!

The call sequence is "VOP_PATHCONF -> zfs_netbsd_pathconf -> zfs_pathconf"
where zfs_netbsd_pathconf() handles the NetBSD specific names if
zfs_pathconf() returns EOPNOTSUPP and also returns EINVAL for
unsupported names in this case.

Please revert.

--
J. Hannken-Illjes - hann...@mailbox.org


signature.asc
Description: Message signed with OpenPGP


Re: CVS commit: src/sys/kern

2022-10-01 Thread Robert Elz
Date:Sat, 1 Oct 2022 13:00:04 -0400
From:Christos Zoulas 
Message-ID:  <8bd3a408-77fd-402c-8d6c-ad4b4a5e8...@zoulas.com>


  | This is what I meant:
  |
  | RCS file: /cvsroot/src/sys/kern/kern_descrip.c,v
  | retrieving revision 1.251
  | diff -u -u -r1.251 kern_descrip.c
  | --- kern_descrip.c  29 Jun 2021 22:40:53 -  1.251
  | +++ kern_descrip.c  1 Oct 2022 16:56:44 -
  | @@ -1162,6 +1162,7 @@
  | KASSERT(ff->ff_allocated);
  | KASSERT(fd_isused(fdp, fd));
  | KASSERT(fd >= NDFDFILE || ff == (fdfile_t *)fdp->fd_dfdfile[fd]);
  | +   ff->ff_exclose = (fp->f_flag & O_CLOEXEC) != 0;
  |
  | /* No need to lock in order to make file initially visible. */
  | ff->ff_file = fp;

That's not going to work in the situation we're in, as nothing will have
set O_CLOEXEC in fp->f_flag (or shouldn't have anyway, right, O_CLOEXEC isn't
that kind of a flag, it belongs to the descriptor, not the file*).

f_flag is set in vfs_syscalls.c in open_setfp() - which is never called
in the cloning device case, that's the underlying problem I believe.

Even when it is called, the code is:

fp->f_flag = flags & FMASK;

where FMASK is (from fcntl.h)

#define FMASK   (FREAD|FWRITE|FCNTLFLAGS|FEXEC)

and

#define FCNTLFLAGS  (FAPPEND|FASYNC|FFSYNC|FNONBLOCK|FDSYNC|FRSYNC|FALTIO|\
 FDIRECT|FNOSIGPIPE)

which all looks exactly as it should be to me - and note that O_CLOEXEC
(which has no F equivalent name - it doesn't need one) is not there.

So, fp->f_flag isn't set at all (is probably 0), and even if it were
O_CLOEXEC would not be there, and should not be.

For the vnode actually being opened, none of this matters, as the open
lasts as long (actually not as long) as the open() sys call - when that
returns, the device being opened has been closed again, so what the
file that refers to it looks like (or would have looked like) really
doesn't matter at all, it never becomes visible.   That's my guess as
to why the open_setfp() call is missing in that case.

But what got forgotten (or deliberately was not done) was anything to
affect the modes of the clone which is opened.

  | What does it mean when the open specifies O_CLOEXEC
  | and ff->ff_exclose is false? Can that happen? Is that desirable?

It is what is currently happening whenever we open a cloning device.
(Or that is what it looks like to me).   Desirable, no idea, I didn't
define the semantics of cloning device opens, nor which of the open
flags should apply to the clone that is created.

  | I am fine with the locking to stay where it is. I guess it is probably
  | not needed after dup/clone, since the underlying vnode is shared...

Assuming you mean dup(2) (and dup2()) and clone(2) then no - those have
no way to pass the relevant locking flags, if you have a fd and want to
apply a lock, fcntl() is what does that (and the fcntl operations that
duplicate fds do not also apply locks).

The only issue is O_EXLOCK and O_SHLOCK (and O_CLOEXEC) for cloned devices.

I suspect it makes sense for O_CLOEXEC to be applied in that case, it
makes little sense for open("/dev/ptmx", O_RDWR|O_CLOEXEC) to succeed,
returning an open fd which does not have cloexec set (which is the issue,
along with O_NONBLOCK) which started all of this discussion.

The locking flags I am less sure about.   I don't see how they can fail
to succeed if applied, as the vnode for the device has just been created,
nothing else can possibly have any kind of lock on it.   Whether there's
any benefit in applying the lock so that the node is locked for later,
I don't know - but it certainly should do no harm to do that.

It seems clear to me that what we need is (something like)

Index: vfs_syscalls.c
===
RCS file: /cvsroot/src/sys/kern/vfs_syscalls.c,v
retrieving revision 1.555
diff -u -r1.555 vfs_syscalls.c
--- vfs_syscalls.c  12 Feb 2022 15:51:29 -  1.555
+++ vfs_syscalls.c  1 Oct 2022 19:27:15 -
@@ -1763,6 +1763,9 @@
error = fd_dupopen(dupfd, dupfd_move, flags, );
if (error)
return error;
+   error = open_setfp(l, fp, XXXvp, indx, flags);
+   if (error)
+   return error;
*fd = indx;
} else {
error = open_setfp(l, fp, vp, indx, flags);


where XXXvp needs to be extracted from somewhere (it isn't vp, as vp==NULL)
except that what follows in the else case is ...

if (error)
return error;
VOP_UNLOCK(vp);
*fd = indx;


That VOP_UNLOCK(vp) is what is bothering me,   It tells me that open_setfp()
is expecting to be called with vp locked - but in the first case (the cloning
case) there is no VOP_UNLOCK() call (and what's more, fd_dupopen() cannot
do it, as it has no vp arg).   That means, I 

Re: CVS commit: src/sys/kern

2022-10-01 Thread Christos Zoulas


> On Sep 30, 2022, at 11:02 PM, Robert Elz  wrote:
> 
>Date:Fri, 30 Sep 2022 20:15:07 -0400
>From:Christos Zoulas 
>Message-ID:  
> 
>  | It does not need an extra flag (it looks in the file descriptor flags to
>  | find if it needs to set or not.
> 
> One of us is confused.   From where in this case does anything
> get the exclose flag set?   That's the whole question here.  The
> flags arg that is passed around has O_CLOEXEC set in it - you used
> that in the call to fd_set_exclose() in kern/tty_ptm.c ... but where
> you said that would be better done in fd_affix().

This is what I meant:

RCS file: /cvsroot/src/sys/kern/kern_descrip.c,v
retrieving revision 1.251
diff -u -u -r1.251 kern_descrip.c
--- kern_descrip.c  29 Jun 2021 22:40:53 -  1.251
+++ kern_descrip.c  1 Oct 2022 16:56:44 -
@@ -1162,6 +1162,7 @@
KASSERT(ff->ff_allocated);
KASSERT(fd_isused(fdp, fd));
KASSERT(fd >= NDFDFILE || ff == (fdfile_t *)fdp->fd_dfdfile[fd]);
+   ff->ff_exclose = (fp->f_flag & O_CLOEXEC) != 0;

/* No need to lock in order to make file initially visible. */
ff->ff_file = fp;

> 
> That does not have access to the flags.   So from where is it going
> to get the close on exec info ?
> 
> My reading of do_open() is that the O_CLOEXEC flag is never even
> examined when a cloning device is opened, it doesn't get set on
> the original fd (the cloner) or the cloned device (other than by
> your recent modification for /dev/pmx).
> 
> Did I misread the code?
> 
> Or are you planning something different than it seemed?
> 
>  | to find other cases where we forgot to call fd_set_exclose() before calling
>  | fd_affix().
> 
> My point is that it should not be necessary to call fd_set_exclose()
> in every (or any) cloning device driver.  The open syscall handling
> is where that should be done, just as it is for all the opens that
> are not cloning devices.

What does it mean when the open specifies O_CLOEXEC
and ff->ff_exclose is false? Can that happen? Is that desirable?

> Why be different?
> 
>  | It also does not need locking because the process can't access
>  | the descriptor before calling fd_affix.
> 
> The locking I was referring to are the vnode locks/references in
> do_open(), not anything related to the file struct or descriptor.
> I just do not feel competent to get all of that correct in this
> case (more complex than the normal case because of the extra vnode
> involved) and would prefer if someone familiar with all of that
> were to handle it - particularly in the extra error case that will
> need to be handled, even if I cannot see how it would actually fire
> in the case in question.

I am fine with the locking to stay where it is. I guess it is probably
not needed after dup/clone, since the underlying vnode is shared...

christos


signature.asc
Description: Message signed with OpenPGP


Re: CVS commit: src/sys/kern

2022-09-30 Thread Robert Elz
Date:Fri, 30 Sep 2022 20:15:07 -0400
From:Christos Zoulas 
Message-ID:  

  | It does not need an extra flag (it looks in the file descriptor flags to
  | find if it needs to set or not.

One of us is confused.   From where in this case does anything
get the exclose flag set?   That's the whole question here.  The
flags arg that is passed around has O_CLOEXEC set in it - you used
that in the call to fd_set_exclose() in kern/tty_ptm.c ... but where
you said that would be better done in fd_affix().

That does not have access to the flags.   So from where is it going
to get the close on exec info ?

My reading of do_open() is that the O_CLOEXEC flag is never even
examined when a cloning device is opened, it doesn't get set on
the original fd (the cloner) or the cloned device (other than by
your recent modification for /dev/pmx).

Did I misread the code?

Or are you planning something different than it seemed?

  | to find other cases where we forgot to call fd_set_exclose() before calling
  | fd_affix().

My point is that it should not be necessary to call fd_set_exclose()
in every (or any) cloning device driver.  The open syscall handling
is where that should be done, just as it is for all the opens that
are not cloning devices.

Why be different?

  | It also does not need locking because the process can't access
  | the descriptor before calling fd_affix.

The locking I was referring to are the vnode locks/references in
do_open(), not anything related to the file struct or descriptor.
I just do not feel competent to get all of that correct in this
case (more complex than the normal case because of the extra vnode
involved) and would prefer if someone familiar with all of that
were to handle it - particularly in the extra error case that will
need to be handled, even if I cannot see how it would actually fire
in the case in question.

kre


Re: CVS commit: src/sys/kern

2022-09-30 Thread Christos Zoulas


> On Sep 30, 2022, at 5:57 PM, Robert Elz  wrote:
> 
>Date:Fri, 30 Sep 2022 16:34:20 -0400
>From:Christos Zoulas 
>Message-ID:  <232331ad-d501-4547-b730-03590c0c9...@zoulas.com>
> 
>  | How about handling exclose there?
> 
> That would be possible, but why?   We still need higher level code to
> handle the locking, which can also handle cloexec -- the problem we
> have now is simply that the relevant call is missing, I don't think adding
> it will be hard, but it needs to be done by someone who understands the
> locking requirements, and correct exit strategy in this case if an error
> occurs (failing to successfully lock a newly created clone would seem to
> be a very bizarre case, but still...)   That is, I don't feel competent to
> suggest the 3 or 4 lines that ought be added in do_open to fix this (for
> just O_CLOEXEC it would be trivial there, as that cannot fail).
> 
> Currently fd_affix (I mistakenly made it fp_affix in the last message...)
> doesn't have a flags parameter, so to do it the way you suggest, we'd need
> to alter its signature, bump to 9.99.101 (and I haven't yet gotten around
> to making my kernel be 98.99.100 which I'm kind of planning to do ...)
> and go alter all the calls everywhere, mostly just filling in an extra
> arg with a 0.

It does not need an extra flag (it looks in the file descriptor flags to
find if it needs to set or not. In fact the first thing I thought was to add
an assertion to make sure that the flags agrees with that is set in exclose,
to find other cases where we forgot to call fd_set_exclose() before calling
fd_affix(). It also does not need locking because the process can't access
the descriptor before calling fd_affix.

christos


signature.asc
Description: Message signed with OpenPGP


Re: CVS commit: src/sys/kern

2022-09-30 Thread Paul Goyette

On Sat, 1 Oct 2022, Robert Elz wrote:


Currently fd_affix (I mistakenly made it fp_affix in the last message...)
doesn't have a flags parameter, so to do it the way you suggest, we'd need
to alter its signature, bump to 9.99.101 ...


and add some COMPAT_09 goop for backward compability  :)



... (and I haven't yet gotten around
to making my kernel be 98.99.100 which I'm kind of planning to do ...)
and go alter all the calls everywhere, mostly just filling in an extra
arg with a 0.

kre


!DSPAM:63376773211686829812153!




++--+--+
| Paul Goyette   | PGP Key fingerprint: | E-mail addresses:|
| (Retired)  | FA29 0E3B 35AF E8AE 6651 | p...@whooppee.com|
| Software Developer | 0786 F758 55DE 53BA 7731 | pgoye...@netbsd.org  |
| & Network Engineer |  | pgoyett...@gmail.com |
++--+--+


Re: CVS commit: src/sys/kern

2022-09-30 Thread Robert Elz
Date:Fri, 30 Sep 2022 16:34:20 -0400
From:Christos Zoulas 
Message-ID:  <232331ad-d501-4547-b730-03590c0c9...@zoulas.com>

  | How about handling exclose there?

That would be possible, but why?   We still need higher level code to
handle the locking, which can also handle cloexec -- the problem we
have now is simply that the relevant call is missing, I don't think adding
it will be hard, but it needs to be done by someone who understands the
locking requirements, and correct exit strategy in this case if an error
occurs (failing to successfully lock a newly created clone would seem to
be a very bizarre case, but still...)   That is, I don't feel competent to
suggest the 3 or 4 lines that ought be added in do_open to fix this (for
just O_CLOEXEC it would be trivial there, as that cannot fail).

Currently fd_affix (I mistakenly made it fp_affix in the last message...)
doesn't have a flags parameter, so to do it the way you suggest, we'd need
to alter its signature, bump to 9.99.101 (and I haven't yet gotten around
to making my kernel be 98.99.100 which I'm kind of planning to do ...)
and go alter all the calls everywhere, mostly just filling in an extra
arg with a 0.

kre



Re: CVS commit: src/sys/kern

2022-09-30 Thread Christos Zoulas


> On Sep 30, 2022, at 10:13 AM, Robert Elz  wrote:
> 
>Date:Thu, 29 Sep 2022 16:47:06 - (UTC)
>From:chris...@astron.com (Christos Zoulas)
>Message-ID:  
> 
>  | I think that the way to go is to:
>  |
>  | 1. Do the fd_set_exclose() in fd_affix(). That will remove most of the 
> calls
>  |to fd_set_exclose() *and* the open-coded versions of it.
>  | 2. Move the open_setfp locking initialization code to fd_affix() and do it
>  |if fp->f_type == DTYPE_VNODE. This should enable locking in all the
>  |appropriate cloners.
> 
> I initially intended to reply and say that decisions where to put stuff
> like that were for someone else (you, dholland, ...) rather than me, as
> I haven't played around much at this level since before vnodes existed.
> 
> But I have been thinking about it, and I disagree with that approach.
> 
> fp_affix() has a job to do, and should be left to do it, without being
> burdened by applying weird side effects, sometimes.   The "do one thing
> and do it well" philosophy applies to more than the commands.
> 
> eg: currently fd_affix() is a void func, but to handle the lock flags
> it would need to be able to fail, and return an error code.  It would
> also need to be able to sleep.   That's just wrong.
> 
> O_CLOEXEC and O_??LOCK are high level open() flags, and deserve to be
> handled somewhere near the upper levels of the open syscall handling,
> not buried in some utility function.

That is the feedback that I wanted. But there were two parts to it. How about
handling exclose there? It is just making sure that the value from flags is 
propagated
to the exclose field.

christos


signature.asc
Description: Message signed with OpenPGP


Re: CVS commit: src/sys/kern

2022-09-30 Thread Robert Elz
Date:Thu, 29 Sep 2022 16:47:06 - (UTC)
From:chris...@astron.com (Christos Zoulas)
Message-ID:  

  | I think that the way to go is to:
  |
  | 1. Do the fd_set_exclose() in fd_affix(). That will remove most of the calls
  |to fd_set_exclose() *and* the open-coded versions of it.
  | 2. Move the open_setfp locking initialization code to fd_affix() and do it
  |if fp->f_type == DTYPE_VNODE. This should enable locking in all the
  |appropriate cloners.

I initially intended to reply and say that decisions where to put stuff
like that were for someone else (you, dholland, ...) rather than me, as
I haven't played around much at this level since before vnodes existed.

But I have been thinking about it, and I disagree with that approach.

fp_affix() has a job to do, and should be left to do it, without being
burdened by applying weird side effects, sometimes.   The "do one thing
and do it well" philosophy applies to more than the commands.

eg: currently fd_affix() is a void func, but to handle the lock flags
it would need to be able to fail, and return an error code.  It would
also need to be able to sleep.   That's just wrong.

O_CLOEXEC and O_??LOCK are high level open() flags, and deserve to be
handled somewhere near the upper levels of the open syscall handling,
not buried in some utility function.

kre



Re: CVS commit: src/sys/kern

2022-09-29 Thread Christos Zoulas
In article <9275.1664462...@jacaranda.noi.kre.to>,
Robert Elz   wrote:
>Date:Thu, 29 Sep 2022 08:18:28 -0400
>From:"Christos Zoulas" 
>Message-ID:  <20220929121828.06edff...@cvs.netbsd.org>
>
>  | Log Message:
>  | Add fd_set_exclose(). It is probably better to do this automatically in
>  | fd_affix()...
>
>Since that only affects /dev/ptmx I'd suggest fixing it generally for all
>cloning devices (and handling O_??LOCK as well) would be a better method.

I think that the way to go is to:

1. Do the fd_set_exclose() in fd_affix(). That will remove most of the calls
   to fd_set_exclose() *and* the open-coded versions of it.
2. Move the open_setfp locking initialization code to fd_affix() and do it
   if fp->f_type == DTYPE_VNODE. This should enable locking in all the
   appropriate cloners.

Best,

christos



Re: CVS commit: src/sys/kern

2022-09-29 Thread Robert Elz
Date:Thu, 29 Sep 2022 08:18:28 -0400
From:"Christos Zoulas" 
Message-ID:  <20220929121828.06edff...@cvs.netbsd.org>

  | Log Message:
  | Add fd_set_exclose(). It is probably better to do this automatically in
  | fd_affix()...

Since that only affects /dev/ptmx I'd suggest fixing it generally for all
cloning devices (and handling O_??LOCK as well) would be a better method.

kre



Re: CVS commit: src/lib/libc/stdlib

2022-09-29 Thread Robert Elz
Date:Thu, 29 Sep 2022 08:18:49 -0400
From:Christos Zoulas 
Message-ID:  

  | Yes, I had forgotten about the need to do this explicitly...

  | > On Sep 28, 2022, at 10:23 PM, Robert Elz  wrote:
  | > 
  | > Apologies, I did not read the code closely enough, there must be a bug
  | > in the way the clone file descriptor is created.

I think I know the cause now, and it isn't anything specific to /dev/ptmx
it would affect all cloning devices.

In kern/vfs_syscalls.c do_open() we see the following:

if (vp == NULL) {
fd_abort(p, fp, indx);
error = fd_dupopen(dupfd, dupfd_move, flags, );
if (error)
return error;
*fd = indx;   
} else {
error = open_setfp(l, fp, vp, indx, flags);
if (error)
return error;
VOP_UNLOCK(vp);
*fd = indx;
fd_affix(p, fp, indx);
}

The vp==NULL case is where cloning devices are handled.   fd_dupopen()
arranges top make the duplicate happen.

It attempts to handle O_CLOEXEC thus:

error = fd_dup(fp, 0, newp, ff->ff_exclose);

where ff_exclose is the close_on_exec bit for the old file descriptor
(the one being duplicated) which is just fine for other calls of fd_dupopen().

The vp!=NULL case above is for "normal" opens, and open_setfp() is what
handles O_CLOEXEC - that's what (eventually) makes ff_exclose be true
(it also handles O_EXLOCK and O_SHLOCK.

None of that is being done in the cloning device open case, so those 3
flags simply cannot work, as the code is written currently, for these
devices.

I'm not about to go playing in this very fiddly piece of code, but I'm
kind of hoping that dholland@ might know the right magic sequence to make
this issue go away.

It probably means a call to open_setfp() with some parameters or other,
somewhere in the vp==NULL side of that if statement, though, just what
I have not attempted to work out.

kre



Re: CVS commit: src/lib/libc/stdlib

2022-09-29 Thread Christos Zoulas
Yes, I had forgotten about the need to do this explicitly...

christos

> On Sep 28, 2022, at 10:23 PM, Robert Elz  wrote:
> 
> Apologies, I did not read the code closely enough, there must be a bug
> in the way the clone file descriptor is created.
> 
> kre



signature.asc
Description: Message signed with OpenPGP


Re: CVS commit: src/lib/libc/stdlib

2022-09-28 Thread Robert Elz
Apologies, I did not read the code closely enough, there must be a bug
in the way the clone file descriptor is created.

kre


Re: CVS commit: src/lib/libc/stdlib

2022-09-28 Thread Robert Elz
Date:Wed, 28 Sep 2022 20:48:53 -0400
From:"David H. Gutteridge" 
Message-ID:  <9c9c8e9d9384338320b47dabfdc94...@gutteridge.ca>

   
  | (O_CLOEXEC, on the other hand, is ignored, at the moment.)

No it isn't, your test is faulty, O_CLOEXEC is a different
kind of flag, applies at a different level, and is fetched
a different way.

That's what dholland@ tried to tell you a few days ago.

kre
  |
  | $ cat open_test.c
  | #include 
  | #include 
  | #include 
  |
  | int main(int argc, char* argv[])
  | {
  | int fd, flags;
  |
  | printf("Regular file (read-only) with O_NONBLOCK | O_CLOEXEC.\n");
  | if ((fd = open("/etc/release", O_RDONLY | O_NONBLOCK | O_CLOEXEC)) < 0)
  | printf("Failed to get file handle.\n");
  | else {
  | printf("Descriptor flags: %d\n", flags = fcntl(fd, F_GETFD));
  | printf("Status flags: %d\n", flags = fcntl(fd, F_GETFL, 0));
  | close(fd);
  | }
  |
  | printf("POSIX pt cloning device with O_NONBLOCK | O_CLOEXEC.\n");
  | if ((fd = open("/dev/ptmx", O_RDWR | O_NOCTTY | O_NONBLOCK | 
  | O_CLOEXEC)) < 0)
  | printf("Failed to open /dev/ptmx.\n");
  | else {
  | printf("Descriptor flags: %d\n", flags = fcntl(fd, F_GETFD));
  | printf("Status flags: %d\n", flags = fcntl(fd, F_GETFL, 0));
  | close(fd);
  | }
  |
  | return 0;
  | }
  |
  | $ ./open_test
  | Regular file (read-only) with O_NONBLOCK | O_CLOEXEC.
  | Descriptor flags: 1
  | Status flags: 4
  | POSIX pt cloning device with O_NONBLOCK | O_CLOEXEC.
  | Descriptor flags: 0
  | Status flags: 6
  |
  | Regards,
  |
  | Dave
  |


Re: CVS commit: src/lib/libc/stdlib

2022-09-28 Thread David H. Gutteridge

On Wed, 28 Sep 2022, at 15:07:41 - (UTC), Christos Zoulas wrote:

In article <20220928003547.D2375FA92%cvs.NetBSD.org@localhost>,
David H. Gutteridge  wrote:

-=-=-=-=-=-

Module Name:src
Committed By:   gutteridge
Date:   Wed Sep 28 00:35:47 UTC 2022

Modified Files:
src/lib/libc/stdlib: posix_openpt.3

Log Message:
posix_openpt.3: reflect flag changes from r. 1.44 of tty_ptm.c

Some flags are now accepted, others are still ignored. (E.g., other
BSDs would return EINVAL if O_RDWR wasn't passed, and we now accept
O_NONBLOCK but not O_CLOEXEC.)


How so?

#define FCNTLFLAGS  
(FAPPEND|FASYNC|FFSYNC|FNONBLOCK|FDSYNC|FRSYNC|FALTIO|\

  ^
FDIRECT|FNOSIGPIPE)
/* bits to save after open(2) */
#define FMASK   (FREAD|FWRITE|FCNTLFLAGS|FEXEC)
 ^^

Best,

christos


Hi Christos,

I'm sorry, I don't follow your question? I wrote "we now accept
O_NONBLOCK...", which is what I'm reading you're saying too?

(O_CLOEXEC, on the other hand, is ignored, at the moment.)

$ cat open_test.c
#include 
#include 
#include 

int main(int argc, char* argv[])
{
int fd, flags;

printf("Regular file (read-only) with O_NONBLOCK | O_CLOEXEC.\n");
if ((fd = open("/etc/release", O_RDONLY | O_NONBLOCK | O_CLOEXEC)) < 0)
printf("Failed to get file handle.\n");
else {
printf("Descriptor flags: %d\n", flags = fcntl(fd, F_GETFD));
printf("Status flags: %d\n", flags = fcntl(fd, F_GETFL, 0));
close(fd);
}

printf("POSIX pt cloning device with O_NONBLOCK | O_CLOEXEC.\n");
	if ((fd = open("/dev/ptmx", O_RDWR | O_NOCTTY | O_NONBLOCK | 
O_CLOEXEC)) < 0)

printf("Failed to open /dev/ptmx.\n");
else {
printf("Descriptor flags: %d\n", flags = fcntl(fd, F_GETFD));
printf("Status flags: %d\n", flags = fcntl(fd, F_GETFL, 0));
close(fd);
}

return 0;
}

$ ./open_test
Regular file (read-only) with O_NONBLOCK | O_CLOEXEC.
Descriptor flags: 1
Status flags: 4
POSIX pt cloning device with O_NONBLOCK | O_CLOEXEC.
Descriptor flags: 0
Status flags: 6

Regards,

Dave


Re: CVS commit: src/usr.bin/make

2022-09-28 Thread Christos Zoulas
In article <20220928163447.b0bf6f...@cvs.netbsd.org>,
Simon J. Gerraty  wrote:
>-=-=-=-=-=-
>
>Module Name:   src
>Committed By:  sjg
>Date:  Wed Sep 28 16:34:47 UTC 2022
>
>Modified Files:
>   src/usr.bin/make: main.c meta.c
>
>Log Message:
>Don't ignore return from snprintf or getcwd

But if you check, you should also check < 0

christos



Re: CVS commit: src/distrib/notes/sparc64

2022-09-28 Thread Jason Thorpe


> On Sep 28, 2022, at 11:37 AM, Charlotte Koch  wrote:

> Maybe it makes sense to write an article in our own wiki instead? From
> what I gather, reprogramming these NVRAM/clock chips is a *very* common
> task. (In fact, I'm going to do it myself with the Sun Blade 100 that
> I'm getting tomorrow!)

Indeed, I think maintaining our own set of technical articles (citing other 
sources as appropriate, of course), would be a good thing.

-- thorpej



Re: CVS commit: src/distrib/notes/sparc64

2022-09-28 Thread Charlotte Koch

On Thu, 29 Sep 2022, Izumi Tsutsui wrote:


Modified Files:
src/distrib/notes/sparc64: prep

Log Message:
Avoid dead link to the NVRAM/Hostid FAQ


Maybe this mirror is better than archive.org:
http://www.obsolyte.com/sunFAQ/faq_nvram.html

---
Izumi Tsutsui


That link is a mirror to an older version of the document (1998 versus
2004), but indeed it certainly feels nicer than relying upon the Wayback
Machine. Besides, I *wanted* to use something better than archive.org in
the first place, but couldn't find anything. I think I will change it to
the obsolyte.com link, thanks for the pointer.

Maybe it makes sense to write an article in our own wiki instead? From
what I gather, reprogramming these NVRAM/clock chips is a *very* common
task. (In fact, I'm going to do it myself with the Sun Blade 100 that
I'm getting tomorrow!)

Charlotte


Re: CVS commit: src/distrib/notes/sparc64

2022-09-28 Thread Izumi Tsutsui
> Modified Files:
>   src/distrib/notes/sparc64: prep
> 
> Log Message:
> Avoid dead link to the NVRAM/Hostid FAQ

Maybe this mirror is better than archive.org:
 http://www.obsolyte.com/sunFAQ/faq_nvram.html

---
Izumi Tsutsui


Re: CVS commit: src/lib/libc/stdlib

2022-09-28 Thread Christos Zoulas
In article <20220928003547.d2375f...@cvs.netbsd.org>,
David H. Gutteridge  wrote:
>-=-=-=-=-=-
>
>Module Name:   src
>Committed By:  gutteridge
>Date:  Wed Sep 28 00:35:47 UTC 2022
>
>Modified Files:
>   src/lib/libc/stdlib: posix_openpt.3
>
>Log Message:
>posix_openpt.3: reflect flag changes from r. 1.44 of tty_ptm.c
>
>Some flags are now accepted, others are still ignored. (E.g., other
>BSDs would return EINVAL if O_RDWR wasn't passed, and we now accept
>O_NONBLOCK but not O_CLOEXEC.)

How so?

#define FCNTLFLAGS  (FAPPEND|FASYNC|FFSYNC|FNONBLOCK|FDSYNC|FRSYNC|FALTIO|\
   ^
 FDIRECT|FNOSIGPIPE)
/* bits to save after open(2) */
#define FMASK   (FREAD|FWRITE|FCNTLFLAGS|FEXEC)
  ^^

Best,

christos



Re: CVS commit: src/sys/arch/arm/ti

2022-09-25 Thread Taylor R Campbell
> Module Name:src
> Committed By:   riastradh
> Date:   Sun Sep 25 07:50:32 UTC 2022
> 
> Modified Files:
> src/sys/arch/arm/ti: ti_fb.c ti_lcdc.c
> 
> Log Message:
> tilcdc(4): Set is_console on the drm device, not the fb child.
> 
> The drm device is represented by a rockchip,display-subsystem node in
> the device tree.  The fb child is a purely software abstraction used
> by drm.

This was supposed to read:

The drm device is represented by a ti,am33xx-tilcdc node in
the device tree.  The fb child is a purely software
abstraction used by drm.


Re: CVS commit: src/sys/arch/arm/sunxi

2022-09-25 Thread Taylor R Campbell
> Module Name:src
> Committed By:   riastradh
> Date:   Sun Sep 25 07:50:23 UTC 2022
> 
> Modified Files:
> src/sys/arch/arm/sunxi: sunxi_drm.c sunxi_fb.c
> 
> Log Message:
> sunxidrm: Set is_console on the drm device, not the fb child.
> 
> The drm device is represented by a rockchip,display-subsystem node in
> the device tree.  The fb child is a purely software abstraction used
> by drm.

This was supposed to read:

The drm device is represented by an
allwinner,sun*i-*-display-engine node in the device tree.  The
fb child is a purely software abstraction used by drm.


Re: CVS commit: src/sys/dev/pci

2022-09-21 Thread Nick Hudson

Hi,


On 21/09/2022 08:56, matthew green wrote:

this asserts for me.  perhaps ryo@ didn't have LOCKDEBUG?


I had LOCKDEBUG on, but not DEBUG. 
https://nxr.netbsd.org/xref/src/sys/sys/systm.h#760
I see that adding options DEBUG does indeed cause a panic with 
ASSERT_SLEEPABLE()...


ah!  that would explain it.  nick asked me to test switching
sc_mutex to IPL_SOFTCLOCK mutex, and this works, though it
exposed another bug in reboot that i also needed a fix for
(ifnet locked), see patch below with both fixes.

there's another with the rev 1.32 and rev 1.33+patch driver
when doing "ifconfig aq0 down up", i get this shortly after
and packets no longer flow:

aq0: watchdog timeout -- resetting
aq0: aq_handle_reset_work: INTR_MASK/STATUS = 0001/
aq0: aq_handle_reset_work: TXring[0] HEAD/TAIL=26/51
aq0: aq_handle_reset_work: TXring[1] HEAD/TAIL=153/170
aq0: aq_handle_reset_work: TXring[2] HEAD/TAIL=0/0
aq0: aq_handle_reset_work: TXring[3] HEAD/TAIL=0/0
aq0: aq_handle_reset_work: TXring[4] HEAD/TAIL=0/0
aq0: aq_handle_reset_work: TXring[5] HEAD/TAIL=0/0
aq0: aq_handle_reset_work: TXring[6] HEAD/TAIL=0/0
aq0: aq_handle_reset_work: TXring[7] HEAD/TAIL=0/0



I think this diff is slightly better and might even fix the problem
you're seeing with "ifconfig aq0 down up"

Nick
Index: sys/dev/pci/if_aq.c
===
RCS file: /cvsroot/src/sys/dev/pci/if_aq.c,v
retrieving revision 1.33
diff -u -p -r1.33 if_aq.c
--- sys/dev/pci/if_aq.c	16 Sep 2022 03:55:53 -	1.33
+++ sys/dev/pci/if_aq.c	21 Sep 2022 08:15:26 -
@@ -1278,7 +1278,7 @@ aq_attach(device_t parent, device_t self
 	int error;
 
 	sc->sc_dev = self;
-	mutex_init(>sc_mutex, MUTEX_DEFAULT, IPL_NET);
+	mutex_init(>sc_mutex, MUTEX_DEFAULT, IPL_SOFTCLOCK);
 	mutex_init(>sc_mpi_mutex, MUTEX_DEFAULT, IPL_NET);
 
 	sc->sc_pc = pc = pa->pa_pc;
@@ -1588,7 +1588,9 @@ aq_detach(device_t self, int flags __unu
 
 	if (sc->sc_iosize != 0) {
 		if (ifp->if_softc != NULL) {
-			aq_stop(ifp, 0);
+			IFNET_LOCK(ifp);
+			aq_stop(ifp, 1);
+			IFNET_UNLOCK(ifp);
 		}
 
 		for (i = 0; i < AQ_NINTR_MAX; i++) {
@@ -1616,8 +1618,6 @@ aq_detach(device_t self, int flags __unu
 		sc->sc_iosize = 0;
 	}
 
-	callout_stop(>sc_tick_ch);
-
 #if NSYSMON_ENVSYS > 0
 	if (sc->sc_sme != NULL) {
 		/* all sensors associated with this will also be detached */


re: CVS commit: src/sys/dev/pci

2022-09-21 Thread matthew green
> >this asserts for me.  perhaps ryo@ didn't have LOCKDEBUG?
>
> I had LOCKDEBUG on, but not DEBUG. 
> https://nxr.netbsd.org/xref/src/sys/sys/systm.h#760
> I see that adding options DEBUG does indeed cause a panic with 
> ASSERT_SLEEPABLE()...

ah!  that would explain it.  nick asked me to test switching
sc_mutex to IPL_SOFTCLOCK mutex, and this works, though it
exposed another bug in reboot that i also needed a fix for
(ifnet locked), see patch below with both fixes.

there's another with the rev 1.32 and rev 1.33+patch driver
when doing "ifconfig aq0 down up", i get this shortly after
and packets no longer flow:

aq0: watchdog timeout -- resetting
aq0: aq_handle_reset_work: INTR_MASK/STATUS = 0001/
aq0: aq_handle_reset_work: TXring[0] HEAD/TAIL=26/51
aq0: aq_handle_reset_work: TXring[1] HEAD/TAIL=153/170
aq0: aq_handle_reset_work: TXring[2] HEAD/TAIL=0/0
aq0: aq_handle_reset_work: TXring[3] HEAD/TAIL=0/0
aq0: aq_handle_reset_work: TXring[4] HEAD/TAIL=0/0
aq0: aq_handle_reset_work: TXring[5] HEAD/TAIL=0/0
aq0: aq_handle_reset_work: TXring[6] HEAD/TAIL=0/0
aq0: aq_handle_reset_work: TXring[7] HEAD/TAIL=0/0


.mrg.


Index: if_aq.c
===
RCS file: /cvsroot/src/sys/dev/pci/if_aq.c,v
retrieving revision 1.33
diff -p -u -r1.33 if_aq.c
--- if_aq.c 16 Sep 2022 03:55:53 -  1.33
+++ if_aq.c 21 Sep 2022 07:52:59 -
@@ -1278,7 +1278,7 @@ aq_attach(device_t parent, device_t self
int error;
 
sc->sc_dev = self;
-   mutex_init(>sc_mutex, MUTEX_DEFAULT, IPL_NET);
+   mutex_init(>sc_mutex, MUTEX_DEFAULT, IPL_SOFTCLOCK);
mutex_init(>sc_mpi_mutex, MUTEX_DEFAULT, IPL_NET);
 
sc->sc_pc = pc = pa->pa_pc;
@@ -1588,7 +1588,9 @@ aq_detach(device_t self, int flags __unu
 
if (sc->sc_iosize != 0) {
if (ifp->if_softc != NULL) {
+   IFNET_LOCK(ifp);
aq_stop(ifp, 0);
+   IFNET_UNLOCK(ifp);
}
 
for (i = 0; i < AQ_NINTR_MAX; i++) {


Re: CVS commit: src/sys/dev/pci

2022-09-20 Thread Ryo Shimizu


>> Module Name: src
>> Committed By:skrll
>> Date:Fri Sep 16 03:55:53 UTC 2022
>>
>> Modified Files:
>>  src/sys/dev/pci: if_aq.c
>>
>> Log Message:
>> Some MP improvements
>>
>> - Remove use of IFF_OACTIVE
>>
>> - Remove use of if_timer and provide an MP safe multiqueue watchdog
>>
>> - Sprinkle some lock assertions.
>>
>> Tested by ryo@. Thanks.
>
>this asserts for me.  perhaps ryo@ didn't have LOCKDEBUG?

I had LOCKDEBUG on, but not DEBUG. 
https://nxr.netbsd.org/xref/src/sys/sys/systm.h#760
I see that adding options DEBUG does indeed cause a panic with 
ASSERT_SLEEPABLE()...

-- 
ryo shimizu


re: CVS commit: src/sys/dev/pci

2022-09-20 Thread matthew green
"Nick Hudson" writes:
> Module Name:  src
> Committed By: skrll
> Date: Fri Sep 16 03:55:53 UTC 2022
>
> Modified Files:
>   src/sys/dev/pci: if_aq.c
>
> Log Message:
> Some MP improvements
>
> - Remove use of IFF_OACTIVE
>
> - Remove use of if_timer and provide an MP safe multiqueue watchdog
>
> - Sprinkle some lock assertions.
>
> Tested by ryo@. Thanks.

this asserts for me.  perhaps ryo@ didn't have LOCKDEBUG?

the problem is that aq_init() calls AQ_LOCK(sc) -- this is
a spin mutex -- and then calls aq_init_locked().  however,
aq_init_locked() calls ASSERT_SLEEPABLE() since it has a
code path that calls callout_halt() (which wants to sleep.)

ie, the function that expects to be called with a spin
mutex held also calls ASSERT_SLEEPABLE().  even if i were
to comment that call, the later call to callout_halt() is
the real problem.

the only way i saw to handle this without investing some
other method to invoke the callout_halt() from another lwp
was to change aq_stop_locked() to return a value that says
that callout_halt is needed here.  that needs to be passed
upto aq_stop() as well as aq_init(), both of which call
aq_stop_locked().  it needs a little re-arrange due to
aq_init_locked() already returning a value for aq_init()
to return directly (and aq_init() is where the mutex will
be dropped, and it's safe to callout_halt().)

for now i'm running with rev 1.32.

thanks.


.mrg.


Re: CVS commit: src/sys/dev/nvmm

2022-09-15 Thread Reinoud Zandijk
Hi Taylor,

Thanks for updating NVMM! Would it be a good idea to sync our code with
DragonBSDs? AFAICS they kept the code compiling for NetBSD too. To have a
common code base?

With regards,
Reinoud

On Tue, Sep 13, 2022 at 08:10:04PM +, Taylor R Campbell wrote:
> Module Name:  src
> Committed By: riastradh
> Date: Tue Sep 13 20:10:04 UTC 2022
> 
> Modified Files:
>   src/sys/dev/nvmm: nvmm.c nvmm_internal.h
>   src/sys/dev/nvmm/x86: nvmm_x86_vmx.c
> 
> Log Message:
> nvmm(4): Add suspend/resume support.
> 
> New MD nvmm_impl callbacks:
> 
> - .suspend_interrupt forces all VMs on all physical CPUs to exit.
> - .vcpu_suspend suspends an individual vCPU on a machine.
> - .machine_suspend suspends an individual machine.
> - .suspend suspends the whole system.
> - .resume resumes the whole system.
> - .machine_resume resumes an individual machine.
> - .vcpu_resume resumes an indidivudal vCPU on a machine.
> 
> Suspending nvmm:
> 
> 1. causes new VM operations (ioctl and close) to block until resumed,
> 2. uses .suspend_interrupt to interrupt any concurrent and force them
>to return early, and then
> 3. uses the various suspend callbacks to suspend all vCPUs, machines,
>and the whole system -- all vCPUs before the machine they're on,
>and all machines before the system.
> 
> Resuming nvmm does the reverse of (3) -- resume system, resume each
> machine and then the vCPUs on that machine -- and then unblocks
> operations.
> 
> Implemented only for x86-vmx for now:
> 
> - suspend_interrupt triggers a TLB IPI to cause VM exits;
> - vcpu_suspend issues VMCLEAR to force any in-CPU state to be written
>   to memory;
> - machine_suspend does nothing;
> - suspend does VMXOFF on all CPUs;
> - resume does VMXON on all CPUs;
> - machine_resume does nothing; and
> - vcpu_resume just marks each vCPU as valid but inactive so
>   subsequent use will clear it and load it with vmptrld.
> 
> x86-svm left as an exercise for the reader.
> 
> 
> To generate a diff of this commit:
> cvs rdiff -u -r1.46 -r1.47 src/sys/dev/nvmm/nvmm.c
> cvs rdiff -u -r1.20 -r1.21 src/sys/dev/nvmm/nvmm_internal.h
> cvs rdiff -u -r1.84 -r1.85 src/sys/dev/nvmm/x86/nvmm_x86_vmx.c
> 
> Please note that diffs are not public domain; they are subject to the
> copyright notices on the relevant files.


Re: CVS commit: src/sys/fs/union

2022-09-12 Thread Rin Okuyama

Thanks for quick fix!

rin

On 2022/09/12 22:10, Christos Zoulas wrote:

Yup!

christos


On Sep 12, 2022, at 5:04 AM, Rin Okuyama  wrote:

Hi,

On 2022/09/12 0:42, Christos Zoulas wrote:

@@ -420,11 +423,11 @@ union_statvfs(struct mount *mp, struct s
  {
int error;
struct union_mount *um = MOUNTTOUNIONMOUNT(mp);
-   struct statvfs *sbuf = malloc(sizeof(*sbuf), M_TEMP, M_WAITOK | M_ZERO);
+   struct statvfs *sbuf = kmem_alloc(sizeof(*sbuf), KM_SLEEP);
unsigned long lbsize;
#ifdef UNION_DIAGNOSTIC
-   printf("union_statvfs(mp = %p, lvp = %p, uvp = %p)\n", mp,
+   printf("%s(mp = %p, lvp = %p, uvp = %p)\n", __func__, mp,
um->um_lowervp, um->um_uppervp);
  #endif



kmem_zalloc() here?

Thanks,
rin




Re: CVS commit: src/sys/fs/union

2022-09-12 Thread Christos Zoulas
Yup!

christos

> On Sep 12, 2022, at 5:04 AM, Rin Okuyama  wrote:
> 
> Hi,
> 
> On 2022/09/12 0:42, Christos Zoulas wrote:
>> @@ -420,11 +423,11 @@ union_statvfs(struct mount *mp, struct s
>>  {
>>  int error;
>>  struct union_mount *um = MOUNTTOUNIONMOUNT(mp);
>> -struct statvfs *sbuf = malloc(sizeof(*sbuf), M_TEMP, M_WAITOK | M_ZERO);
>> +struct statvfs *sbuf = kmem_alloc(sizeof(*sbuf), KM_SLEEP);
>>  unsigned long lbsize;
>>#ifdef UNION_DIAGNOSTIC
>> -printf("union_statvfs(mp = %p, lvp = %p, uvp = %p)\n", mp,
>> +printf("%s(mp = %p, lvp = %p, uvp = %p)\n", __func__, mp,
>>  um->um_lowervp, um->um_uppervp);
>>  #endif
>> 
> 
> kmem_zalloc() here?
> 
> Thanks,
> rin



signature.asc
Description: Message signed with OpenPGP


Re: CVS commit: src/sys/fs/union

2022-09-12 Thread Rin Okuyama

Hi,

On 2022/09/12 0:42, Christos Zoulas wrote:

@@ -420,11 +423,11 @@ union_statvfs(struct mount *mp, struct s
  {
int error;
struct union_mount *um = MOUNTTOUNIONMOUNT(mp);
-   struct statvfs *sbuf = malloc(sizeof(*sbuf), M_TEMP, M_WAITOK | M_ZERO);
+   struct statvfs *sbuf = kmem_alloc(sizeof(*sbuf), KM_SLEEP);
unsigned long lbsize;
  
  #ifdef UNION_DIAGNOSTIC

-   printf("union_statvfs(mp = %p, lvp = %p, uvp = %p)\n", mp,
+   printf("%s(mp = %p, lvp = %p, uvp = %p)\n", __func__, mp,
um->um_lowervp, um->um_uppervp);
  #endif
  


kmem_zalloc() here?

Thanks,
rin


Re: CVS commit: src/distrib/sets/lists/debug

2022-08-28 Thread David Holland
On Sun, Aug 28, 2022 at 03:30:41AM -0400, Christos Zoulas wrote:
 > Modified Files:
 >  src/distrib/sets/lists/debug: mi
 > 
 > Log Message:
 > fix sets

Bah. Sorry about that. Been too long since the last time...

-- 
David A. Holland
dholl...@netbsd.org


Re: CVS commit: src

2022-08-28 Thread Harold Gutch
Hi,

On Sun, Aug 28, 2022 at 03:26:28PM +0300, Valery Ushakov wrote:
> On Sun, Aug 28, 2022 at 10:48:17 +, Harold Gutch wrote:
> 
> > Change back various occurrences of \*[Le], \*[Ge] (less/greater equal)
> > and \*(ua (upwards arrow) to literal "<=", ">=" and "^" whenever
> > appropriate (e.g., in code examples).
> 
> Thanks.  Using your commit as a pretext (and in no way intended as
> censure) ...
> 
> I wonder if we should switch to consistently use \(ha (a groff
> extension also understood by heirloom) for ascii circumflex.  Troff
> turns plain ^ into circumflex accent which is visually very small and
> you cannot find it, obviously, when you search pdf output for "^".

No strong opinion here - my test case earlier was "man csh" when
LC_CTYPE=en_US.UTF-8 where \*(ua renders an actual error.  Both ^ and
\ha render a caret here, so no objection from my side to such a
change.


> Check out e.g.
> 
>   groff -Tps -mandoc csh.1 > csh.ps && ps2pdf csh.ps csh.pdf
> 
> (that postscript output has a lot of other problems, b/c examples are
> not in monospaced font, etc, which makes the ^ size problem even worse
> visually).

At least on macOS (and using pstopdf from there), \ha seems like a 
small improvement over ^ in that Preview now actually finds
occurrences of ^ if you search for it.  It doesn't highlight the
carets but it now returns the pages with one on it (it doesn't find
the pages only containing ones that come from a ^ in the source).


  Harold


  1   2   3   4   5   6   7   8   9   10   >