Fwd: futex(2) not working in inherited mmap'd anon memory

2021-10-26 Thread Amit Kulkarni
---------- Forwarded message ---------
From: Thomas Munro 
Date: Tue, Oct 26, 2021 at 5:36 AM
Subject: futex(2) not working in inherited mmap'd anon memory
To: 


Hello,

When I do mmap(MAP_ANONYMOUS | MAP_SHARED) and then fork(), it seems
that futex(2) wakeups are not delivered between child and parent in
that memory.  It does work as expected if I instead use
shmget(IPC_PRIVATE).

Below is a standalone test program.  I tested it with the four OSes
mentioned, and the two shmem types depending on that #if, and all
worked as expected except the OpenBSD/mmap case, which hangs.

Is it a bug?

$ uname -a
OpenBSD openbsd6.localdomain 6.9 GENERIC.MP#473 amd64

Thanks,

=== 8< ===

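/* Header names were lost in the archive; these are the ones the code needs. */
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/mman.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#if defined(__linux__)
#include <linux/futex.h>
#include <sys/syscall.h>
#elif defined(__OpenBSD__)
#include <sys/time.h>
#include <sys/futex.h>
#elif defined(__FreeBSD__)
#include <sys/types.h>
#include <sys/umtx.h>
#elif defined(__APPLE__)
#define UL_COMPARE_AND_WAIT_SHARED 3
#define ULF_WAKE_ALL 0x0100
extern int __ulock_wait(uint32_t operation, void *addr, uint64_t value,
    uint32_t timeout);
extern int __ulock_wake(uint32_t operation, void *addr, uint64_t wake_value);
#endif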

static int
my_futex_wait_u32(void *fut, uint32_t value, struct timespec *timeout)
{
#if defined(__linux__)
    if (syscall(SYS_futex, fut, FUTEX_WAIT, value, timeout, 0, 0) == 0)
        return 0;
#elif defined(__OpenBSD__)
    if ((errno = futex(fut, FUTEX_WAIT, (int) value, timeout, NULL)) == 0)
        return 0;
    if (errno == ECANCELED)
        errno = EINTR;
#elif defined(__FreeBSD__)
    if (_umtx_op(fut, UMTX_OP_WAIT_UINT, value, 0, timeout) == 0)
        return 0;
#elif defined(__APPLE__)
    /* __ulock_wait() takes its timeout in microseconds. */
    if (__ulock_wait(UL_COMPARE_AND_WAIT_SHARED, (void *) fut, value,
        timeout ? timeout->tv_sec * 1000000 + timeout->tv_nsec / 1000 : 0) >= 0)
        return 0;
#else
    errno = ENOSYS;
#endif

    return -1;
}

static int
my_futex_wake(void *fut, int nwaiters)
{
#if defined(__linux__)
    if (syscall(SYS_futex, fut, FUTEX_WAKE, nwaiters, NULL, 0, 0) >= 0)
        return 0;
#elif defined(__OpenBSD__)
    if (futex(fut, FUTEX_WAKE, nwaiters, NULL, NULL) >= 0)
        return 0;
#elif defined(__FreeBSD__)
    if (_umtx_op(fut, UMTX_OP_WAKE, nwaiters, 0, 0) == 0)
        return 0;
#elif defined(__APPLE__)
    if (__ulock_wake(UL_COMPARE_AND_WAIT_SHARED |
        (nwaiters > 1 ? ULF_WAKE_ALL : 0), (void *) fut, 0) >= 0)
        return 0;
    if (errno == ENOENT)
        return 0;
#else
    errno = ENOSYS;
#endif

    return -1;
}

int
main(int argc, char *argv[])
{
    pid_t pid;
    uint32_t *memory;
    int status;

#if 1
    memory = mmap(NULL, sizeof(uint32_t), PROT_READ | PROT_WRITE,
        MAP_ANONYMOUS | MAP_SHARED, -1, 0);
    if (memory == MAP_FAILED) {
        perror("mmap");
        return EXIT_FAILURE;
    }
#else
    int shm_id;

    shm_id = shmget(IPC_PRIVATE, sizeof(uint32_t), IPC_CREAT | 0666);
    if (shm_id < 0) {
        perror("shmget");
        return EXIT_FAILURE;
    }
    memory = shmat(shm_id, NULL, 0);
    if ((intptr_t) memory == -1) {
        perror("shmat");
        return EXIT_FAILURE;
    }
#endif

    *memory = 42;

    pid = fork();
    if (pid == -1) {
        perror("fork");
        return EXIT_FAILURE;
    } else if (pid > 0) {
        printf("hello from parent, will wait for futex...\n");
        if (my_futex_wait_u32(memory, 42, NULL) < 0) {
            perror("futex_wait");
            wait(&status);
            return EXIT_FAILURE;
        }
        wait(&status);
        return EXIT_SUCCESS;
    } else {
        printf("hello from child, will wake futex...\n");
        sleep(1);
        if (my_futex_wake(memory, INT_MAX) < 0) {
            perror("futex_wake");
            return EXIT_FAILURE;
        }
        return EXIT_SUCCESS;
    }
}



Fwd: explicit_bzero vs. alternatives

2020-08-10 Thread Amit Kulkarni
moving to tech@

---------- Forwarded message ---------
From: Philipp Klaus Krause 
Date: Mon, Aug 10, 2020 at 4:34 AM
Subject: explicit_bzero vs. alternatives
To: 


OpenBSD has the explicit_bzero function to reliably (i.e. even if not
observable in the C abstract machine) overwrite memory with zeroes.

WG14 is currently considering adding similar functionality to C2X.

Considered options include:

* A function like explicit_bzero or memset_explicit, that overwrites the
memory with a known value.
* A function like memclear, that overwrites the memory in an
implementation-defined manner, possibly using random data.

Is there a rationale why OpenBSD went with their explicit_bzero design?
Were alternatives considered and rejected?

Philipp
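
The first option in the list above (overwrite with a known value) can be sketched in portable C. This is not OpenBSD's explicit_bzero implementation: wipe_with_value is a hypothetical name, and writing through a volatile-qualified pointer is only one common technique that compilers in practice do not optimize away. A memclear-style function would differ only in being free to pick the fill bytes itself.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (assumption, not a libc function): overwrite a
 * buffer with a fixed byte through a volatile-qualified pointer, so that
 * every store is an access the compiler treats as observable.
 */
static void
wipe_with_value(void *buf, unsigned char value, size_t len)
{
    volatile unsigned char *p = buf;

    assert(buf != NULL || len == 0);
    while (len-- > 0)
        *p++ = value;
}
```

A caller would invoke wipe_with_value(key, 0, sizeof(key)) just before the buffer goes out of scope, which is exactly the situation where a plain memset() may be removed as a dead store.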



Rename ps_flags to ps_pwmflags in /sys

2020-03-12 Thread Amit Kulkarni
Hi,

These files are only hooked up to the build on arm64, and I don't have this 
arch, so this is not even compile tested by me. Hopefully, I did not miss 
anything while grepping. Can somebody with arm64 please compile and report any 
issues?

Thanks



Index: dev/fdt/amlpwm.c
===
RCS file: /cvs/src/sys/dev/fdt/amlpwm.c,v
retrieving revision 1.1
diff -u -p -u -p -r1.1 amlpwm.c
--- dev/fdt/amlpwm.c 30 Sep 2019 20:42:45 -  1.1
+++ dev/fdt/amlpwm.c 13 Mar 2020 01:38:32 -
@@ -181,7 +181,7 @@ amlpwm_set_state(void *cookie, uint32_t 
return EINVAL;
 
/* Hardware doesn't support polarity inversion. */
-   if (ps->ps_flags & PWM_POLARITY_INVERTED)
+   if (ps->ps_pwmflags & PWM_POLARITY_INVERTED)
return EINVAL;
 
if (!ps->ps_enabled) {
Index: dev/fdt/rkpwm.c
===
RCS file: /cvs/src/sys/dev/fdt/rkpwm.c,v
retrieving revision 1.1
diff -u -p -u -p -r1.1 rkpwm.c
--- dev/fdt/rkpwm.c 3 Dec 2019 09:08:48 -   1.1
+++ dev/fdt/rkpwm.c 13 Mar 2020 01:38:32 -
@@ -170,7 +170,7 @@ rkpwm_set_state(void *cookie, uint32_t *
HCLR4(sc, PWM_V2_CTRL, PWM_V2_CTRL_INACTIVE_POSITIVE);
HCLR4(sc, PWM_V2_CTRL, PWM_V2_CTRL_DUTY_POSITIVE);
 
-   if (ps->ps_flags & PWM_POLARITY_INVERTED)
+   if (ps->ps_pwmflags & PWM_POLARITY_INVERTED)
HSET4(sc, PWM_V2_CTRL, PWM_V2_CTRL_INACTIVE_POSITIVE);
else
HSET4(sc, PWM_V2_CTRL, PWM_V2_CTRL_DUTY_POSITIVE);
Index: dev/fdt/sxipwm.c
===
RCS file: /cvs/src/sys/dev/fdt/sxipwm.c,v
retrieving revision 1.1
diff -u -p -u -p -r1.1 sxipwm.c
--- dev/fdt/sxipwm.c 21 Oct 2019 20:52:33 -  1.1
+++ dev/fdt/sxipwm.c 13 Mar 2020 01:38:32 -
@@ -217,7 +217,7 @@ sxipwm_set_state(void *cookie, uint32_t 
reg |= (PWM_CH0_EN | SCLK_CH0_GATING);
else
reg &= ~(PWM_CH0_EN | SCLK_CH0_GATING);
-   if (ps->ps_flags & PWM_POLARITY_INVERTED)
+   if (ps->ps_pwmflags & PWM_POLARITY_INVERTED)
reg &= ~PWM_CH0_ACT_STA;
else
reg |= PWM_CH0_ACT_STA;
Index: dev/ofw/ofw_misc.c
===
RCS file: /cvs/src/sys/dev/ofw/ofw_misc.c,v
retrieving revision 1.14
diff -u -p -u -p -r1.14 ofw_misc.c
--- dev/ofw/ofw_misc.c  1 Mar 2020 18:00:12 -   1.14
+++ dev/ofw/ofw_misc.c  13 Mar 2020 01:38:32 -
@@ -304,7 +304,7 @@ pwm_init_state(uint32_t *cells, struct p
if (pd->pd_cells > 2)
ps->ps_period = cells[2];
if (pd->pd_cells > 3)
-   ps->ps_flags = cells[3];
+   ps->ps_pwmflags = cells[3];
return 0;
}
}
Index: dev/ofw/ofw_misc.h
===
RCS file: /cvs/src/sys/dev/ofw/ofw_misc.h,v
retrieving revision 1.10
diff -u -p -u -p -r1.10 ofw_misc.h
--- dev/ofw/ofw_misc.h  21 Feb 2020 15:46:16 -  1.10
+++ dev/ofw/ofw_misc.h  13 Mar 2020 01:38:32 -
@@ -93,7 +93,7 @@ int   sfp_get_sffpage(uint32_t, struct if_
 struct pwm_state {
uint32_t ps_period;
uint32_t ps_pulse_width;
-   uint32_t ps_flags;
+   uint32_t ps_pwmflags;
int ps_enabled;
 };
 



Re: Rename ps_flags -> ps_sigflags in struct sigacts

2020-03-12 Thread Amit Kulkarni
> > 
> > I have the same issue with ps_flags vs ps_flags and I think it resulted in
> > some major confusion for others as well. See inline.
> 
> I agree this would make grep/looking at code easier :o)
> 
> > > @@ -336,9 +336,9 @@ setsigvec(struct proc *p, int signum, st
> > >   ps->ps_catchmask[signum] = sa->sa_mask &~ sigcantmask;
> > >   if (signum == SIGCHLD) {
> > >   if (sa->sa_flags & SA_NOCLDSTOP)
> > > - atomic_setbits_int(&ps->ps_flags, SAS_NOCLDSTOP);
> > > + atomic_setbits_int(&ps->ps_sigflags, SAS_NOCLDSTOP);
> > >   else
> > > - atomic_clearbits_int(&ps->ps_flags, SAS_NOCLDSTOP);
> > > + atomic_clearbits_int(&ps->ps_sigflags, SAS_NOCLDSTOP);
> > 
> > I doubt these should be atomic functions here. The sigacts ps_flags
> > don't need atomic updates (especially since most other calls are not
> > atomic).
> 
> Indeed, this is a leftover from the introduction of P_SIGSUSPEND.
> Before that per-thread flags were used and needed atomic operations.
> 
> That said we can rename it then remove the atomic operations :)
> 
> ok mpi@


Hi,

Per feedback from both of you, I am resending the entire diff, which now also 
removes the atomic set/clear bit operations on struct sigacts. I will send the 
similar rename diff for ps_pwmflags in a separate email.

Thanks

Index: kern/init_main.c
===
RCS file: /cvs/src/sys/kern/init_main.c,v
retrieving revision 1.296
diff -u -p -u -p -r1.296 init_main.c
--- kern/init_main.c 25 Feb 2020 16:55:33 -  1.296
+++ kern/init_main.c 12 Mar 2020 23:40:49 -
@@ -639,7 +639,7 @@ start_init(void *arg)
check_console(p);
 
/* process 0 ignores SIGCHLD, but we can't */
-   p->p_p->ps_sigacts->ps_flags = 0;
+   p->p_p->ps_sigacts->ps_sigflags = 0;
 
/*
 * Need just enough stack to hold the faked-up "execve()" arguments.
Index: kern/kern_exit.c
===
RCS file: /cvs/src/sys/kern/kern_exit.c,v
retrieving revision 1.185
diff -u -p -u -p -r1.185 kern_exit.c
--- kern/kern_exit.c 1 Mar 2020 18:50:52 -   1.185
+++ kern/kern_exit.c 12 Mar 2020 23:40:49 -
@@ -215,7 +215,7 @@ exit1(struct proc *p, int xexit, int xsi
 * If parent has the SAS_NOCLDWAIT flag set, we're not
 * going to become a zombie.
 */
-   if (pr->ps_pptr->ps_sigacts->ps_flags & SAS_NOCLDWAIT)
+   if (pr->ps_pptr->ps_sigacts->ps_sigflags & SAS_NOCLDWAIT)
atomic_setbits_int(&pr->ps_flags, PS_NOZOMBIE);
}
 
Index: kern/kern_sig.c
===
RCS file: /cvs/src/sys/kern/kern_sig.c,v
retrieving revision 1.252
diff -u -p -u -p -r1.252 kern_sig.c
--- kern/kern_sig.c 11 Mar 2020 15:45:03 -  1.252
+++ kern/kern_sig.c 12 Mar 2020 23:40:49 -
@@ -285,9 +285,9 @@ sys_sigaction(struct proc *p, void *v, r
if ((ps->ps_siginfo & bit) != 0)
sa->sa_flags |= SA_SIGINFO;
if (signum == SIGCHLD) {
-   if ((ps->ps_flags & SAS_NOCLDSTOP) != 0)
+   if ((ps->ps_sigflags & SAS_NOCLDSTOP) != 0)
sa->sa_flags |= SA_NOCLDSTOP;
-   if ((ps->ps_flags & SAS_NOCLDWAIT) != 0)
+   if ((ps->ps_sigflags & SAS_NOCLDWAIT) != 0)
sa->sa_flags |= SA_NOCLDWAIT;
}
if ((sa->sa_mask & bit) == 0)
@@ -336,9 +336,9 @@ setsigvec(struct proc *p, int signum, st
ps->ps_catchmask[signum] = sa->sa_mask &~ sigcantmask;
if (signum == SIGCHLD) {
if (sa->sa_flags & SA_NOCLDSTOP)
-   atomic_setbits_int(&ps->ps_flags, SAS_NOCLDSTOP);
+   ps->ps_sigflags |= SAS_NOCLDSTOP;
else
-   atomic_clearbits_int(&ps->ps_flags, SAS_NOCLDSTOP);
+   ps->ps_sigflags &= ~SAS_NOCLDSTOP;
/*
 * If the SA_NOCLDWAIT flag is set or the handler
 * is SIG_IGN we reparent the dying child to PID 1
@@ -350,9 +350,9 @@ setsigvec(struct proc *p, int signum, st
if (initprocess->ps_sigacts != ps &&
((sa->sa_flags & SA_NOCLDWAIT) ||
sa->sa_handler == SIG_IGN))
-   atomic_setbits_int(&ps->ps_flags, SAS_NOCLDWAIT);
+   ps->ps_sigflags |= SAS_NOCLDWAIT;
else
-   atomic_clearbits_int(&ps->ps_flags, SAS_NOCLDWAIT);
+   ps->ps_sigflags &= ~SAS_NOCLDWAIT;
}
if ((sa->sa_flags & SA_RESETHAND) != 0)
ps->ps_sigreset |= bit;
@@ -406,7 +406,7 @@ siginit(struct process *pr)
for (i = 0; i < NSIG; i++)
if (sigprop[i] & SA_IGNORE && i != 

Rename ps_flags -> ps_sigflags in struct sigacts

2020-03-12 Thread Amit Kulkarni
Hi,

In grepping for ps_flags in /sys, it is confusing to see that ps_flags is 
associated with

1) PWM_POLARITY (power regulation?).
Proposed to rename to ps_pwmflags?
2) process signals: struct sigacts in /sys/sys/signalvar.h
3) its rightful usage as ps_flags for struct process in /sys/sys/proc.h


So, to reduce confusion while grepping, the below diff simply renames usages of 
ps_flags in relation to struct sigacts (#2 above) to ps_sigflags.

Thanks

Index: kern/init_main.c
===
RCS file: /cvs/src/sys/kern/init_main.c,v
retrieving revision 1.296
diff -u -p -u -p -r1.296 init_main.c
--- kern/init_main.c 25 Feb 2020 16:55:33 -  1.296
+++ kern/init_main.c 12 Mar 2020 12:58:41 -
@@ -639,7 +639,7 @@ start_init(void *arg)
check_console(p);
 
/* process 0 ignores SIGCHLD, but we can't */
-   p->p_p->ps_sigacts->ps_flags = 0;
+   p->p_p->ps_sigacts->ps_sigflags = 0;
 
/*
 * Need just enough stack to hold the faked-up "execve()" arguments.
Index: kern/kern_exit.c
===
RCS file: /cvs/src/sys/kern/kern_exit.c,v
retrieving revision 1.185
diff -u -p -u -p -r1.185 kern_exit.c
--- kern/kern_exit.c 1 Mar 2020 18:50:52 -   1.185
+++ kern/kern_exit.c 12 Mar 2020 12:58:42 -
@@ -215,7 +215,7 @@ exit1(struct proc *p, int xexit, int xsi
 * If parent has the SAS_NOCLDWAIT flag set, we're not
 * going to become a zombie.
 */
-   if (pr->ps_pptr->ps_sigacts->ps_flags & SAS_NOCLDWAIT)
+   if (pr->ps_pptr->ps_sigacts->ps_sigflags & SAS_NOCLDWAIT)
atomic_setbits_int(&pr->ps_flags, PS_NOZOMBIE);
}
 
Index: kern/kern_sig.c
===
RCS file: /cvs/src/sys/kern/kern_sig.c,v
retrieving revision 1.252
diff -u -p -u -p -r1.252 kern_sig.c
--- kern/kern_sig.c 11 Mar 2020 15:45:03 -  1.252
+++ kern/kern_sig.c 12 Mar 2020 12:58:42 -
@@ -285,9 +285,9 @@ sys_sigaction(struct proc *p, void *v, r
if ((ps->ps_siginfo & bit) != 0)
sa->sa_flags |= SA_SIGINFO;
if (signum == SIGCHLD) {
-   if ((ps->ps_flags & SAS_NOCLDSTOP) != 0)
+   if ((ps->ps_sigflags & SAS_NOCLDSTOP) != 0)
sa->sa_flags |= SA_NOCLDSTOP;
-   if ((ps->ps_flags & SAS_NOCLDWAIT) != 0)
+   if ((ps->ps_sigflags & SAS_NOCLDWAIT) != 0)
sa->sa_flags |= SA_NOCLDWAIT;
}
if ((sa->sa_mask & bit) == 0)
@@ -336,9 +336,9 @@ setsigvec(struct proc *p, int signum, st
ps->ps_catchmask[signum] = sa->sa_mask &~ sigcantmask;
if (signum == SIGCHLD) {
if (sa->sa_flags & SA_NOCLDSTOP)
-   atomic_setbits_int(&ps->ps_flags, SAS_NOCLDSTOP);
+   atomic_setbits_int(&ps->ps_sigflags, SAS_NOCLDSTOP);
else
-   atomic_clearbits_int(&ps->ps_flags, SAS_NOCLDSTOP);
+   atomic_clearbits_int(&ps->ps_sigflags, SAS_NOCLDSTOP);
/*
 * If the SA_NOCLDWAIT flag is set or the handler
 * is SIG_IGN we reparent the dying child to PID 1
@@ -350,9 +350,9 @@ setsigvec(struct proc *p, int signum, st
if (initprocess->ps_sigacts != ps &&
((sa->sa_flags & SA_NOCLDWAIT) ||
sa->sa_handler == SIG_IGN))
-   atomic_setbits_int(&ps->ps_flags, SAS_NOCLDWAIT);
+   atomic_setbits_int(&ps->ps_sigflags, SAS_NOCLDWAIT);
else
-   atomic_clearbits_int(&ps->ps_flags, SAS_NOCLDWAIT);
+   atomic_clearbits_int(&ps->ps_sigflags, SAS_NOCLDWAIT);
}
if ((sa->sa_flags & SA_RESETHAND) != 0)
ps->ps_sigreset |= bit;
@@ -406,7 +406,7 @@ siginit(struct process *pr)
for (i = 0; i < NSIG; i++)
if (sigprop[i] & SA_IGNORE && i != SIGCONT)
ps->ps_sigignore |= sigmask(i);
-   ps->ps_flags = SAS_NOCLDWAIT | SAS_NOCLDSTOP;
+   ps->ps_sigflags = SAS_NOCLDWAIT | SAS_NOCLDSTOP;
 }
 
 /*
@@ -442,7 +442,7 @@ execsigs(struct proc *p)
 * Clear set of signals caught on the signal stack.
 */
sigstkinit(&p->p_sigstk);
-   atomic_clearbits_int(&ps->ps_flags, SAS_NOCLDWAIT);
+   atomic_clearbits_int(&ps->ps_sigflags, SAS_NOCLDWAIT);
if (ps->ps_sigact[SIGCHLD] == SIG_IGN)
ps->ps_sigact[SIGCHLD] = SIG_DFL;
 }
@@ -1360,7 +1360,7 @@ proc_stop_sweep(void *v)
continue;
atomic_clearbits_int(&pr->ps_flags, PS_STOPPED);
 
-   if ((pr->ps_pptr->ps_sigacts->ps_flags & SAS_NOCLDSTOP) == 0)
+   if 

kern_exit.c : 2 tiny buglets + a potential uvm leak

2020-02-29 Thread Amit Kulkarni
Hi,

1) pr->ps_ru is already NULL, so code can be shrunk.
2) missing timeout_del, we clear the 2 timeouts in struct process, but not from 
struct proc.
3) ps_mainproc is allocated in thread_new(), passed to process_new(), and then 
to process_initialize(). ps_mainproc never gets a call to uvm_uarea_free(). So 
to allow ps_mainproc to be freed when called finally in process_zap(), we 
shuffle the uvm_uarea_free() code into proc_free(). Is this analysis correct or 
wrong?

Thanks

Index: kern/kern_exit.c
===
RCS file: /cvs/src/sys/kern/kern_exit.c,v
retrieving revision 1.184
diff -u -p -u -p -r1.184 kern_exit.c
--- kern/kern_exit.c 28 Feb 2020 17:03:05 -  1.184
+++ kern/kern_exit.c 29 Feb 2020 20:29:16 -
@@ -172,12 +172,7 @@ exit1(struct proc *p, int xexit, int xsi
rup = pr->ps_ru;
if (rup == NULL) {
rup = pool_get(&rusage_pool, PR_WAITOK | PR_ZERO);
-   if (pr->ps_ru == NULL) {
-   pr->ps_ru = rup;
-   } else {
-   pool_put(&rusage_pool, rup);
-   rup = pr->ps_ru;
-   }
+   pr->ps_ru = rup;
}
p->p_siglist = 0;
if ((p->p_flag & P_THREAD) == 0)
@@ -390,6 +385,14 @@ exit2(struct proc *p)
 void
 proc_free(struct proc *p)
 {
+   timeout_del(&p->p_sleep_to);
+   /*
+* Free the VM resources we're still holding on to.
+* We must do this from a valid thread because doing
+* so may block.
+*/
+   uvm_uarea_free(p);
+   p->p_vmspace = NULL;/* zap the thread's copy */
crfree(p->p_ucred);
pool_put(&proc_pool, p);
nthreads--;
@@ -422,14 +425,6 @@ reaper(void *arg)
WITNESS_THREAD_EXIT(p);
 
KERNEL_LOCK();
-
-   /*
-* Free the VM resources we're still holding on to.
-* We must do this from a valid thread because doing
-* so may block.
-*/
-   uvm_uarea_free(p);
-   p->p_vmspace = NULL;/* zap the thread's copy */
 
if (p->p_flag & P_THREAD) {
/* Just a thread */



Re: Debugger, traced processes & exit status

2020-02-29 Thread Amit Kulkarni
Hi,

> @@ -303,6 +312,7 @@ struct process {
>  #define  PS_PLEDGE   0x0010  /* Has called pledge(2) */
>  #define  PS_WXNEEDED 0x0020  /* Process may violate W^X */
>  #define  PS_EXECPLEDGE   0x0040  /* Has exec pledges */
> +#define  PS_ORPHAN   0x0080  /* Process is on an orphan list */
>  
>  #define  PS_BITS \
>  ("\20" "\01CONTROLT" "\02EXEC" "\03INEXEC" "\04EXITING" "\05SUGID" \

PS_ORPHAN entry needs to be added to PS_BITS.
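
For context, PS_BITS is a bit-description string of the kind the kernel's printf "%b" conversion consumes: the first byte is the numeric base, and each following entry is a 1-based bit number followed by a name. A flag that is missing from the string is still part of the numeric value but prints without a name. The sketch below decodes such a string in user space; EX_BITS, its flag names and decode_bits are made up for illustration and are not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description string in the same layout as PS_BITS. */
#define EX_BITS "\20" "\01ALPHA" "\02BETA" "\03GAMMA"

/*
 * Write the comma-separated names of the bits set in val into out.
 * desc[0] is the base byte; each entry is <bit number><name>.
 */
static void
decode_bits(unsigned int val, const char *desc, char *out, size_t outlen)
{
    const char *p;
    size_t used = 0;

    assert(desc != NULL && out != NULL && outlen > 0);
    out[0] = '\0';
    p = desc + 1;                   /* skip the numeric base byte */
    while (*p != '\0') {
        int bit = *p++;             /* 1-based bit number */
        int set;

        assert(bit >= 1 && bit <= 32);
        set = (val & (1U << (bit - 1))) != 0;
        if (set && used > 0 && used + 1 < outlen)
            out[used++] = ',';
        /* A name runs until the next byte that is <= ' '. */
        while (*p > ' ') {
            if (set && used + 1 < outlen)
                out[used++] = *p;
            p++;
        }
        out[used] = '\0';
    }
}
```

With this layout, adding a flag to the header without extending the description string is harmless to the code but loses the name in debugging output, which is why the two should be updated together.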

Off topic to this diff: kern_exit.c is a weird mix of ISSET() and open-coded 
tests of the form 1) flag_field & FLAG and 2) (flag_field & FLAG) == 0. Is it 
ok to send a follow-up diff that uses ISSET() or !ISSET() as applicable 
throughout this file, so it reads easier?

Thanks
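
The two styles mentioned above are interchangeable, which is what makes such a follow-up purely cosmetic. A small sketch, assuming the macro has the usual ((t) & (f)) shape; EX_ISSET and the EX_FLAG_* values are local stand-ins so the example builds outside the kernel:

```c
#include <assert.h>

/* Local stand-ins (assumptions), not the kernel definitions. */
#define EX_ISSET(t, f)      ((t) & (f))
#define EX_FLAG_EXITING     0x0008U
#define EX_FLAG_ORPHAN      0x0080U

/* Style 1: open-coded mask test compared against zero. */
static int
exiting_open_coded(unsigned int flags)
{
    return (flags & EX_FLAG_EXITING) != 0;
}

/* Style 2: the same test through an ISSET-style macro. */
static int
exiting_with_macro(unsigned int flags)
{
    return EX_ISSET(flags, EX_FLAG_EXITING) != 0;
}

/* Both styles must give the same answer for any flag word. */
static void
check_styles_agree(unsigned int flags)
{
    assert(exiting_open_coded(flags) == exiting_with_macro(flags));
}
```

Calling check_styles_agree() with a few flag words, for example EX_FLAG_EXITING | EX_FLAG_ORPHAN, confirms that a mechanical conversion would not change behavior.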



Part 2: reduce SCHED_LOCKing while a proc is running

2020-02-17 Thread Amit Kulkarni
Hi,

This is part 2 of previous diff sent earlier and includes it also: "sync 
quantum for a proc with its per-CPU tracking".

While a proc is running, try not to disturb it. Only change the proc's 
priority just before adding it to the runqueue. This reduces SCHED_LOCK 
occurrences in the kernel by a tiny bit: a speedup of between 1.5% and 2% when 
testing with a bsd.mp kernel build. If a proc runs for less than 3 ticks, then 
its priority is left undisturbed. Otherwise it is adjusted just like in 
schedclock(), in the equivalent function decay_prio_ifrun().

In this diff, I made changes to non-amd64 architectures without having access 
to any of them; I believe they should compile and run.

Thanks
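
The "less than 3 ticks" rule follows from the integer division visible in decay_prio_ifrun() in the diff below, which divides the ticks actually used by 3. The function is cut off in this archive, so the sketch only reproduces that one calculation; decay_steps is a hypothetical name and the 10-tick quantum in the usage note is an assumed value.

```c
#include <assert.h>

/*
 * Number of priority adjustment steps for a proc that started with
 * quantum_ticks and has ticks_left remaining. Fewer than 3 used ticks
 * yield 0 steps, i.e. the priority is left alone.
 */
static unsigned int
decay_steps(int quantum_ticks, int ticks_left)
{
    assert(ticks_left >= 0 && ticks_left <= quantum_ticks);
    return (unsigned int)((quantum_ticks - ticks_left) / 3);
}
```

For an assumed quantum of 10 ticks, decay_steps(10, 8) is 0 (only 2 ticks used), decay_steps(10, 7) is 1, and a proc that used its whole quantum gets decay_steps(10, 0), which is 3.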

Index: arch/alpha/alpha/interrupt.c
===
RCS file: /cvs/src/sys/arch/alpha/alpha/interrupt.c,v
retrieving revision 1.40
diff -u -p -u -p -r1.40 interrupt.c
--- arch/alpha/alpha/interrupt.c 21 Jan 2017 05:42:03 -  1.40
+++ arch/alpha/alpha/interrupt.c 13 Feb 2020 01:10:03 -
@@ -234,14 +234,6 @@ interrupt(unsigned long a0, unsigned lon
 * will also deal with time-of-day stuff.
 */
(*platform.clockintr)((struct clockframe *)framep);
-
-   /*
-* If it's time to call the scheduler clock,
-* do so.
-*/
-   if ((++ci->ci_schedstate.spc_schedticks & 0x3f) == 0 &&
-   schedhz != 0)
-   schedclock(ci->ci_curproc);
}
break;
 
Index: arch/sparc64/sparc64/clock.c
===
RCS file: /cvs/src/sys/arch/sparc64/sparc64/clock.c,v
retrieving revision 1.59
diff -u -p -u -p -r1.59 clock.c
--- arch/sparc64/sparc64/clock.c 30 Apr 2017 16:45:45 -  1.59
+++ arch/sparc64/sparc64/clock.c 13 Feb 2020 01:10:04 -
@@ -879,8 +879,6 @@ int
 schedintr(arg)
void *arg;
 {
-   if (curproc)
-   schedclock(curproc);
return (1);
 }
 
Index: kern/kern_clock.c
===
RCS file: /cvs/src/sys/kern/kern_clock.c,v
retrieving revision 1.101
diff -u -p -u -p -r1.101 kern_clock.c
--- kern/kern_clock.c   21 Jan 2020 16:16:23 -  1.101
+++ kern/kern_clock.c   13 Feb 2020 01:10:04 -
@@ -397,14 +397,6 @@ statclock(struct clockframe *frame)
 
if (p != NULL) {
p->p_cpticks++;
-   /*
-* If no schedclock is provided, call it here at ~~12-25 Hz;
-* ~~16 Hz is best
-*/
-   if (schedhz == 0) {
-   if ((++spc->spc_schedticks & 3) == 0)
-   schedclock(p);
-   }
}
 }
 
Index: kern/sched_bsd.c
===
RCS file: /cvs/src/sys/kern/sched_bsd.c,v
retrieving revision 1.62
diff -u -p -u -p -r1.62 sched_bsd.c
--- kern/sched_bsd.c 30 Jan 2020 08:51:27 -  1.62
+++ kern/sched_bsd.c 13 Feb 2020 01:10:04 -
@@ -62,6 +62,7 @@ int   rrticks_init;   /* # of hardclock tic
 struct __mp_lock sched_lock;
 #endif
 
+void   decay_prio_ifrun(struct proc*);
 void   schedcpu(void *);
 uint32_t   decay_aftersleep(uint32_t, uint32_t);
 
@@ -90,21 +91,13 @@ roundrobin(struct cpu_info *ci)
 {
struct schedstate_percpu *spc = &ci->ci_schedstate;
 
-   spc->spc_rrticks = rrticks_init;
-
if (ci->ci_curproc != NULL) {
-   if (spc->spc_schedflags & SPCF_SEENRR) {
-   /*
-* The process has already been through a roundrobin
-* without switching and may be hogging the CPU.
-* Indicate that the process should yield.
-*/
-   atomic_setbits_int(&spc->spc_schedflags,
-   SPCF_SHOULDYIELD);
-   } else {
-   atomic_setbits_int(&spc->spc_schedflags,
-   SPCF_SEENRR);
-   }
+   /*
+* The thread has now completed its full time quantum
+* without being moved off the CPU and may be hogging the CPU.
+* Indicate that the process should yield.
+*/
+   atomic_setbits_int(&spc->spc_schedflags, SPCF_SHOULDYIELD);
}
 
if (spc->spc_nrun)
@@ -291,6 +284,27 @@ decay_aftersleep(uint32_t estcpu, uint32
return (newcpu);
 }
 
+/* Adjust priority depending on how much curproc actually ran */
+void
+decay_prio_ifrun(struct proc *p)
+{
+   struct schedstate_percpu *spc = &p->p_cpu->ci_schedstate;
+   if (p != spc->spc_idleproc) {
+   int j = (rrticks_init - spc->spc_rrticks)/3;
+
+ 

sync quantum for a proc with its per-CPU tracking

2020-02-17 Thread Amit Kulkarni
Hi,

This diff makes sure that a proc and the per-CPU structure that tracks when to 
schedule another proc in round-robin are in sync. No observable change in time 
spent compiling the bsd.mp kernel.

This diff, together with another one that replaces schedclock() with equivalent 
code, shaves approx 5 secs of real time off a bsd.mp kernel build.

Thanks

Index: kern/sched_bsd.c
===
RCS file: /cvs/src/sys/kern/sched_bsd.c,v
retrieving revision 1.62
diff -u -p -u -p -r1.62 sched_bsd.c
--- kern/sched_bsd.c 30 Jan 2020 08:51:27 -  1.62
+++ kern/sched_bsd.c 11 Feb 2020 14:00:17 -
@@ -90,21 +90,13 @@ roundrobin(struct cpu_info *ci)
 {
struct schedstate_percpu *spc = &ci->ci_schedstate;
 
-   spc->spc_rrticks = rrticks_init;
-
if (ci->ci_curproc != NULL) {
-   if (spc->spc_schedflags & SPCF_SEENRR) {
-   /*
-* The process has already been through a roundrobin
-* without switching and may be hogging the CPU.
-* Indicate that the process should yield.
-*/
-   atomic_setbits_int(&spc->spc_schedflags,
-   SPCF_SHOULDYIELD);
-   } else {
-   atomic_setbits_int(&spc->spc_schedflags,
-   SPCF_SEENRR);
-   }
+   /*
+* The process is now completing a roundrobin
+* without switching off the CPU and may be hogging the CPU.
+* Indicate that the process should yield.
+*/
+   atomic_setbits_int(&spc->spc_schedflags, SPCF_SHOULDYIELD);
}
 
if (spc->spc_nrun)
@@ -384,6 +376,16 @@ mi_switch(void)
 * scheduling flags.
 */
atomic_clearbits_int(&spc->spc_schedflags, SPCF_SWITCHCLEAR);
+
+   /* 
+* We start afresh here, sync the proc and the per-cpu state
+* to match exactly on how much time to allow the proc to run.
+* This gives a chance to a proc to get its full quantum, and
+* not worry if there is a chance to have it taken off the CPU
+* at way less than its allotted quantum or have another proc
+* take way more than its allotted quantum.
+*/
+   spc->spc_rrticks = rrticks_init;
 
nextproc = sched_chooseproc();
 
Index: sys/sched.h
===
RCS file: /cvs/src/sys/sys/sched.h,v
retrieving revision 1.56
diff -u -p -u -p -r1.56 sched.h
--- sys/sched.h 21 Oct 2019 10:24:01 -  1.56
+++ sys/sched.h 11 Feb 2020 14:00:17 -
@@ -131,9 +131,8 @@ struct cpustats {
 #ifdef _KERNEL
 
 /* spc_flags */
-#define SPCF_SEENRR 0x0001  /* process has seen roundrobin() */
 #define SPCF_SHOULDYIELD0x0002  /* process should yield the CPU */
-#define SPCF_SWITCHCLEAR(SPCF_SEENRR|SPCF_SHOULDYIELD)
+#define SPCF_SWITCHCLEAR(SPCF_SHOULDYIELD)
 #define SPCF_SHOULDHALT0x0004  /* CPU should be vacated */
 #define SPCF_HALTED0x0008  /* CPU has been halted */
 



Re: sched_steal_proc(): steal from highest loaded CPU

2020-02-05 Thread Amit Kulkarni
> > 'ci' changes after we proceed to another cpu, and cost will be different 
> > for each cpu+proc combination.
> 
> Look at line 516 of kern/kern_sched.c: does the first argument of
> sched_proc_to_cpu_cost() change?

Aaaargh, you are absolutely right! My fault, I was blinded by looking at the 
sched_proc_to_cpu_cost() in sched_choosecpu(), where the ci changes while 
iterating on the cpus, and mentally transferred that to sched_steal_proc().


> > Actually, in the steal case, if there is a large difference in 
> > spc_curpriority and p_usrpri between running proc which is SONPROC vs the 
> > current proc, then the cost will be skewed towards an outlier, but this 
> > will happen for all procs for that ci as you analyzed.
> 
> You missed the point above, the CPU is trying to steal a thread because
> it has nothing to execute, so no SONPROC.
> 
> > To summarize: currently, if a cpu is running 1-2 procs with large 
> > differences in priority, there is a high possibility that this function 
> > will go steal a proc from it for the just idled cpu. Whereas at the same 
> > time, another cpu is running 5-7 procs with exact same priorities, that 
> > will be ignored in the above calculation! We should have stolen from the 
> > cpu with 5-7 procs.
> 
> Is it speculation based on your understanding of what you read or can
> you reproduce this?
> 
> > I hope this is a correct interpretation? Is there something wrong in this 
> > analysis?
> 
> Why do you ask?  Are you interested in changing the actual code?  If so
> why?  Are you interested in understanding what it does?
> 
> > Another small optimization is [...]
> 
> How do you know it's an optimization?  How did you test it?  What is
> your workload?  How did you prove your reasoning?
> 
> I don't know why you're sending such emails, but if your goal is to
> contribute may I suggest that you pick a simpler place to start and
> really understand it before sending diffs?  Maybe test your changes,
> gather data, look at facts instead of guesses... :o)

My apologies, I will base it on facts and testing. :(

Thanks



Re: sched_steal_proc(): steal from highest loaded CPU

2020-02-04 Thread Amit Kulkarni
> > When a cpu is idle, and wants to steal, it should steal from worst loaded 
> > cpu, i.e. with the highest cost, not the least cost.
> 
> What you say might be a valid choice.  However I'm not sure I
> understand how it relates to the behavior of the code.
> 
> When stealing, sched_proc_to_cpu_cost() is always called for the
> same CPU, `ci' is always the same, which means `cost' is equivalent to
> `p_usrpri', right?
> 
> Now the priority space is defined in sys/param.h and a higher priority
> corresponds to a smaller number.  So are we sure we want to steal a thread
> with the highest `cost', which would mean with the lowest priority?  Or
> did I miss something?

When a cpu is idle, it is looking for a proc to steal. Now these two loops 
(outer loop over cpus with queued procs + inner loop over each cpu's runqueue) 
basically loop over all procs in the first readily runnable runqueue of each 
cpu. We should ideally steal from the most loaded CPU, denoted by cost. Cost 
takes into account: priority, runqueue length, primary cpu check, load average, 
and finally memory footprint. Here, the most dominant variable should be the 
runqueue length, since it is multiplied by sched_cost_runnable.

'ci' changes after we proceed to another cpu, and cost will be different for 
each cpu+proc combination. Actually, in the steal case, if there is a large 
difference in spc_curpriority and p_usrpri between running proc which is 
SONPROC vs the current proc, then the cost will be skewed towards an outlier, 
but this will happen for all procs for that ci as you analyzed.

To summarize: currently, if a cpu is running 1-2 procs with large differences 
in priority, there is a high possibility that this function will go steal a 
proc from it for the just idled cpu. Whereas at the same time, another cpu is 
running 5-7 procs with exact same priorities, that will be ignored in the above 
calculation! We should have stolen from the cpu with 5-7 procs.

I hope this is a correct interpretation? Is there something wrong in this 
analysis?

Another small optimization is to issue a break from the TAILQ_FOREACH right 
after the if (cost > bestcost) check, so we short-circuit the potentially 
quadratic lookup (i * j). Currently, this is O(N * N)! With that optimization 
in place, we look at one cpu, keep its unpegged proc in mind for the stealing 
calculation, and then look at just 1 proc on each of the other cpus. Why 
iterate over all procs at the head of the runqueue of each cpu (worst case)? 
One proc per cpu is enough for this stealing calculation, is it not? Once we 
decide which has the highest cost, we send it to the idle cpu. Basically, we 
try to shorten this O(N * N) lookup. My guess is that on machines with a large 
ncpu, the slowness in sched_steal_proc() will be really noticeable.

Thanks
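
The difference between the existing policy (lowest cost wins) and the proposed one (highest cost wins) comes down to one comparison in the selection loop. The sketch below isolates that comparison over a plain array of costs; pick_candidate is a hypothetical helper, not the kernel's sched_steal_proc(), and it says nothing about how the costs themselves are computed.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the candidate with the lowest cost
 * (want_highest == 0) or the highest cost (want_highest != 0),
 * or -1 if there are no candidates. Ties keep the earliest one.
 */
static int
pick_candidate(const int *cost, int n, int want_highest)
{
    int i, best = -1;

    assert(cost != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (best == -1 ||
            (want_highest ? cost[i] > cost[best] : cost[i] < cost[best]))
            best = i;
    }
    return best;
}
```

Keeping the "no candidate yet" test means a candidate is always chosen when one exists. Starting from bestcost = 0 with a bare cost > bestcost test, as in the diff below, would never select a candidate whose cost is zero or negative.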

> 
> > Index: kern/kern_sched.c
> > ===
> > RCS file: /cvs/src/sys/kern/kern_sched.c,v
> > retrieving revision 1.64
> > diff -u -p -u -p -r1.64 kern_sched.c
> > --- kern/kern_sched.c   30 Jan 2020 08:51:27 -  1.64
> > +++ kern/kern_sched.c   4 Feb 2020 14:25:59 -
> > @@ -487,7 +487,7 @@ sched_steal_proc(struct cpu_info *self)
> > struct proc *best = NULL;
> >  #ifdef MULTIPROCESSOR
> > struct schedstate_percpu *spc;
> > -   int bestcost = INT_MAX;
> > +   int bestcost = 0;
> > struct cpu_info *ci;
> > struct cpuset set;
> >  
> > @@ -515,7 +515,7 @@ sched_steal_proc(struct cpu_info *self)
> >  
> > cost = sched_proc_to_cpu_cost(self, p);
> >  
> > -   if (best == NULL || cost < bestcost) {
> > +   if (cost > bestcost) {
> > best = p;
> > bestcost = cost;
> > }

a break statement right here.

> > @@ -524,7 +524,6 @@ sched_steal_proc(struct cpu_info *self)
> > if (best == NULL)
> > return (NULL);
> >  
> > -   spc = &best->p_cpu->ci_schedstate;
> > remrunqueue(best);
> > best->p_cpu = self;
> >  
> > 


-- 
Amit Kulkarni 



sched_steal_proc(): steal from highest loaded CPU

2020-02-04 Thread Amit Kulkarni
Hi,

When a cpu is idle, and wants to steal, it should steal from worst loaded cpu, 
i.e. with the highest cost, not the least cost.
Also, remove a dead store while here.

Thanks

Index: kern/kern_sched.c
===
RCS file: /cvs/src/sys/kern/kern_sched.c,v
retrieving revision 1.64
diff -u -p -u -p -r1.64 kern_sched.c
--- kern/kern_sched.c   30 Jan 2020 08:51:27 -  1.64
+++ kern/kern_sched.c   4 Feb 2020 14:25:59 -
@@ -487,7 +487,7 @@ sched_steal_proc(struct cpu_info *self)
struct proc *best = NULL;
 #ifdef MULTIPROCESSOR
struct schedstate_percpu *spc;
-   int bestcost = INT_MAX;
+   int bestcost = 0;
struct cpu_info *ci;
struct cpuset set;
 
@@ -515,7 +515,7 @@ sched_steal_proc(struct cpu_info *self)
 
cost = sched_proc_to_cpu_cost(self, p);
 
-   if (best == NULL || cost < bestcost) {
+   if (cost > bestcost) {
best = p;
bestcost = cost;
}
@@ -524,7 +524,6 @@ sched_steal_proc(struct cpu_info *self)
if (best == NULL)
return (NULL);
 
-   spc = &best->p_cpu->ci_schedstate;
remrunqueue(best);
best->p_cpu = self;
 



SMR question

2019-12-29 Thread Amit Kulkarni
Hi,
Is the purpose of smr_grace_wait() to round robin the curproc through
all CPUs, to make sure all CPUs have passed the quiescent state? Is
this line of thinking correct or flawed?

I was preparing a diff and wanted to know if it is safe to disable the
SMR kthread by commenting it out for testing purposes on amd64. Just a
simple yes/no would do!

Thanks

/*
 * Block until all CPUs have crossed quiescent state.
 */
void
smr_grace_wait(void)
{
#ifdef MULTIPROCESSOR
CPU_INFO_ITERATOR cii;
struct cpu_info *ci, *ci_start;

ci_start = curcpu();
CPU_INFO_FOREACH(cii, ci) {
if (ci == ci_start)
continue;
sched_peg_curproc(ci);
}
atomic_clearbits_int(&curproc->p_flag, P_CPUPEG);
#endif /* MULTIPROCESSOR */
}



Re: Pump my sched: fewer SCHED_LOCK() & kill p_priority

2019-06-27 Thread Amit Kulkarni
> root on sd2a (88532b67c09ce3ee.a) swap on sd2b dump on sd2b
> TSC skew=-6129185140 drift=170
> TSC skew=-6129184900 drift=-10
> TSC skew=-6129184890 drift=-20
> TSC skew=-6129184910 drift=30
> TSC skew=-6129184910 drift=10
> TSC skew=-6129184900 drift=20
> TSC skew=-6129184910 drift=30
> iwm0: hw rev 0x230, fw ver 22.361476.0, address 68:ec:c5:ad:9a:cb
> initializing kernel modesetting (RAVEN 0x1002:0x15DD 0x17AA:0x506F 0xC4).
> amdgpu0: 1920x1080, 32bpp
> wsdisplay0 at amdgpu0 mux 1: console (std, vt100 emulation), using wskbd0
> wsdisplay0: screen 1-5 added (std, vt100 emulation)
>

It seems that you have Paul's TSC patch also applied. Please apply
just one patch and test separately, and then report back!

Thanks



Re: Pump my sched: fewer SCHED_LOCK() & kill p_priority

2019-06-23 Thread Amit Kulkarni
On Fri, 21 Jun 2019 21:54:18 -0700
Mike Larkin  wrote:

> On Fri, Jun 21, 2019 at 05:11:26PM -0300, Martin Pieuchot wrote:
> > On 06/06/19(Thu) 15:16, Martin Pieuchot wrote:
> > > On 02/06/19(Sun) 16:41, Martin Pieuchot wrote:
> > > > On 01/06/19(Sat) 18:55, Martin Pieuchot wrote:
> > > > > Diff below exists mainly for documentation and test purposes.  If
> > > > > you're not interested about how to break the scheduler internals in
> > > > > pieces, don't read further and go straight to testing!
> > > > > 
> > > > > - First change is to stop calling tsleep(9) at PUSER.  That makes
> > > > >   it clear that all "sleeping priorities" are smaller than PUSER.
> > > > >   That's important to understand for the diff below.  `p_priority'
> > > > >   is currently a placeholder for the "sleeping priority" and the
> > > > >   "runnqueue priority".  Both fields are separated by this diff.
> > > > > 
> > > > > > - When a thread goes to sleep, the priority argument of tsleep(9) is
> > > > > >   now recorded in `p_slpprio'.  This argument can be considered as part
> > > > > >   of the sleep queue.  Its purpose is to place the thread into a higher
> > > > > >   runqueue when awoken.
> > > > > > 
> > > > > > - Currently, for stopped threads, `p_priority' correspond to `p_usrpri'.
> > > > > >   So setrunnable() has been untangled to place SSTOP and SSLEEP threads
> > > > > >   in the preferred queue without having to use `p_priority'.  Note that
> > > > > >   `p_usrpri' is still recalculated *after* having called setrunqueue().
> > > > > >   This is currently fine because setrunnable() is called with SCHED_LOCK()
> > > > > >   but it will be racy when we'll split it.
> > > > > > 
> > > > > > - A new field, `p_runprio' has been introduced.  It should be considered
> > > > > >   as part of the per-CPU runqueues.  It indicates where a current thread
> > > > > >   is placed.
> > > > > > 
> > > > > > - `spc_curpriority' is now updated at every context-switch.  That means
> > > > > >    need_resched() won't be called after comparing an out-of-date value.
> > > > > >    At the same time, `p_usrpri' is initialized to the highest possible
> > > > > >    value for idle threads.
> > > > > > 
> > > > > > - resched_proc() was calling need_resched() in the following conditions:
> > > > >- If the SONPROC thread has a higher priority that the current
> > > > >  running thread (itself).
> > > > >- Twice in setrunnable() when we know that p_priority <= p_usrpri.
> > > > >- If schedcpu() considered that a thread, after updating its prio,
> > > > >  should preempt the one running on the CPU pointed by `p_cpu'. 
> > > > > 
> > > > >   The diff below simplify all of that by calling need_resched() when:
> > > > >- A thread is inserted in a CPU runqueue at a higher priority than
> > > > >  the one SONPROC.
> > > > >- schedcpu() decides that a thread in SRUN state should preempt the
> > > > >  one SONPROC.
> > > > > 
> > > > > - `p_estcpu' `p_usrpri' and `p_slptime' which represent the "priority"
> > > > >   of a thread are now updated while holding a per-thread mutex.  As a
> > > > >   result schedclock() and donice() no longer takes the SCHED_LOCK(),
> > > > >   and schedcpu() almost never take it.
> > > > > 
> > > > > > - With this diff top(1) and ps(1) will report the "real" `p_usrpri' value
> > > > >   when displaying priorities.  This is helpful to understand what's
> > > > >   happening:
> > > > > 
> > > > > > load averages:  0.99,  0.56,  0.25    two.lab.grenadille.net 23:42:10
> > > > > > 70 threads: 68 idle, 2 on processor                          up  0:09
> > > > > > CPU0:  0.0% user,  0.0% nice, 51.0% sys,  2.0% spin,  0.0% intr, 47.1% idle
> > > > > > CPU1:  2.0% user,  0.0% nice, 51.0% sys,  3.9% spin,  0.0% intr, 43.1% idle
> > > > > > Memory: Real: 47M/1005M act/tot Free: 2937M Cache: 812M Swap: 0K/4323M
> > > > > > 
> > > > > >   PID    TID PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
> > > > > > 81000 145101  72    0    0K 1664K sleep/1   bored     1:15 36.96% softnet
> > > > > > 47133 244097  73    0 2984K 4408K sleep/1   netio     1:06 35.06% cvs
> > > > > > 64749 522184  66    0  176K  148K onproc/1  -         0:55 28.81% nfsd
> > > > > > 21615 602473 127    0    0K 1664K sleep/0   -         7:22  0.00% idle0
> > > > > > 12413 606242 127    0    0K 1664K sleep/1   -         7:08  0.00% idle1
> > > > > > 85778 338258  50    0 4936K 7308K idle      select    0:10  0.00% ssh
> > > > > > 22771 575513  50    0  176K  148K sleep/0   nfsd      0:02  0.00% nfsd
> > > > > 
> > > > > 
> > > > > 
> > > > > - The removal of `p_priority' and the change that makes mi_switch()
> > > > >   always update `spc_curpriority' might introduce some 

Re: Pump my sched: fewer SCHED_LOCK() & kill p_priority

2019-06-09 Thread Amit Kulkarni
On Sun, Jun 9, 2019 at 10:39 PM Theo de Raadt  wrote:
>
> Amit Kulkarni  wrote:
>
> > > Index: sys/sysctl.h
> > > ===
> > > RCS file: /cvs/src/sys/sys/sysctl.h,v
> > > retrieving revision 1.188
> > > diff -u -p -r1.188 sysctl.h
> > > --- sys/sysctl.h1 Jun 2019 14:11:18 -   1.188
> > > +++ sys/sysctl.h1 Jun 2019 16:36:13 -
> > > @@ -629,7 +629,7 @@ do {  
> > >   \
> > > (kp)->p_stat = (p)->p_stat; \
> > > (kp)->p_slptime = (p)->p_slptime;   \
> > > (kp)->p_holdcnt = 1;\
> > > -   (kp)->p_priority = (p)->p_priority; \
> > > +   (kp)->p_priority = (p)->p_usrpri + PZERO;   \
> > > (kp)->p_usrpri = (p)->p_usrpri; \
> > > if ((p)->p_wchan && (p)->p_wmesg)   \
> > > copy_str((kp)->p_wmesg, (p)->p_wmesg,   \
> > >
> >
> >
> > Hi,
> >
> > A request, to remove the +PZERO here above and the -PZERO in
> > /usr/src/usr.bin/top/machine.c, why do an unnecessary calculation twice,
> > once to set and other time to unset?
>
> This is getting out of hand.
>
> Have you reviewed the *entire universe* of software to ensure that
> top is the only program which looks at this?
>
> No.  You have not.  So please stop proposing changes where you aren't
> willing to invest into studying the history.

Got it. Sorry, my fault here.



Re: Pump my sched: fewer SCHED_LOCK() & kill p_priority

2019-06-09 Thread Amit Kulkarni
> Index: sys/sysctl.h
> ===
> RCS file: /cvs/src/sys/sys/sysctl.h,v
> retrieving revision 1.188
> diff -u -p -r1.188 sysctl.h
> --- sys/sysctl.h  1 Jun 2019 14:11:18 -   1.188
> +++ sys/sysctl.h  1 Jun 2019 16:36:13 -
> @@ -629,7 +629,7 @@ do {  
> \
>   (kp)->p_stat = (p)->p_stat; \
>   (kp)->p_slptime = (p)->p_slptime;   \
>   (kp)->p_holdcnt = 1;\
> - (kp)->p_priority = (p)->p_priority; \
> + (kp)->p_priority = (p)->p_usrpri + PZERO;   \
>   (kp)->p_usrpri = (p)->p_usrpri; \
>   if ((p)->p_wchan && (p)->p_wmesg)   \
>   copy_str((kp)->p_wmesg, (p)->p_wmesg,   \
> 


Hi,

A request: could we remove the +PZERO here above and the matching -PZERO in
/usr/src/usr.bin/top/machine.c? Why do an unnecessary calculation twice, once
to set the offset and a second time to unset it?

Thanks
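
For readers following along, the round trip being questioned can be sketched
as below. This is a hypothetical userland illustration, not the kernel or
top(1) source: sim_export_priority() stands in for the sysctl.h macro line in
the diff, sim_display_priority() stands in for the subtraction done in
top/machine.c, and SIM_PZERO is set to 22 on the assumption that this matches
PZERO in sys/param.h.

```c
#include <assert.h>

#define SIM_PZERO 22	/* assumed value of PZERO; check sys/param.h */

/* Kernel side of the sketch: the value the diff stores in kp->p_priority. */
static int
sim_export_priority(int usrpri)
{
	assert(usrpri >= 0);
	return usrpri + SIM_PZERO;
}

/* Consumer side: what a tool such as top(1) does before displaying it. */
static int
sim_display_priority(int kp_priority)
{
	return kp_priority - SIM_PZERO;
}
```

A p_usrpri of 50 is exported as 72 and displayed as 50 again. The offset only
cancels out for consumers that subtract it, so any other reader of the
exported field would see different numbers if just one side were changed.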



Re: Pump my sched: fewer SCHED_LOCK() & kill p_priority

2019-06-03 Thread Amit Kulkarni
Hi,

This is a pretty cool diff, splitting the sleep prio from the run prio!

In a few places the documentation comments could say proc instead of process;
I tried to find and mark them below. The current wording leaves the reader
confused for a moment.

thanks

> > - `spc_curpriority' is now updated at every context-switch.  That means
> >need_resched() won't be called after comparing an out-of-date value.
> >At the same time, `p_usrpri' is initialized to the highest possible
> >value for idle threads.
> > - resched_proc() was calling need_resched() in the following conditions:
> >- If the SONPROC thread has a higher priority that the current
> >  running thread (itself).
> >- Twice in setrunnable() when we know that p_priority <= p_usrpri.
> >- If schedcpu() considered that a thread, after updating its prio,
> >  should preempt the one running on the CPU pointed by `p_cpu'. 
> > 
> >   The diff below simplify all of that by calling need_resched() when:
> >- A thread is inserted in a CPU runqueue at a higher priority than
> >  the one SONPROC.
> >- schedcpu() decides that a thread in SRUN state should preempt the
> >  one SONPROC.

Just FYI, this should fix a serious bug: the resched_proc() call was comparing
a stale priority when deciding what to schedule, so it consistently made a
pretty bad decision!

> > - `p_estcpu' `p_usrpri' and `p_slptime' which represent the "priority"
> >   of a thread are now updated while holding a per-thread mutex.  As a
> >   result schedclock() and donice() no longer takes the SCHED_LOCK(),
> >   and schedcpu() almost never take it.

You forgot to mention resetpriority(), which is also moved out from under the
SCHED_LOCK()!

> > 
> > - With this diff top(1) and ps(1) will report the "real" `p_usrpri' value
> >   when displaying priorities.  This is helpful to understand what's
> >   happening:
> > 
> > load averages:  0.99,  0.56,  0.25    two.lab.grenadille.net 23:42:10
> > 70 threads: 68 idle, 2 on processor                          up  0:09
> > CPU0:  0.0% user,  0.0% nice, 51.0% sys,  2.0% spin,  0.0% intr, 47.1% idle
> > CPU1:  2.0% user,  0.0% nice, 51.0% sys,  3.9% spin,  0.0% intr, 43.1% idle
> > Memory: Real: 47M/1005M act/tot Free: 2937M Cache: 812M Swap: 0K/4323M
> > 
> >   PID    TID PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
> > 81000 145101  72    0    0K 1664K sleep/1   bored     1:15 36.96% softnet
> > 47133 244097  73    0 2984K 4408K sleep/1   netio     1:06 35.06% cvs
> > 64749 522184  66    0  176K  148K onproc/1  -         0:55 28.81% nfsd
> > 21615 602473 127    0    0K 1664K sleep/0   -         7:22  0.00% idle0
> > 12413 606242 127    0    0K 1664K sleep/1   -         7:08  0.00% idle1
> > 85778 338258  50    0 4936K 7308K idle      select    0:10  0.00% ssh
> > 22771 575513  50    0  176K  148K sleep/0   nfsd      0:02  0.00% nfsd
> > 
> > 
> > 
> > - The removal of `p_priority' and the change that makes mi_switch()
> >   always update `spc_curpriority' might introduce some changes in
> >   behavior, especially with kernel threads that were not going through
> >   tsleep(9).  We currently have some situations where the priority of
> >   the running thread isn't correctly reflected.  This diff changes that
> >   which means we should be able to better understand where the problems
> >   are.
> > 
> > I'd be interested in comments/tests/reviews before continuing in this
> > direction.  Note that at least part of this diff are required to split
> > the accounting apart from the SCHED_LOCK() as well.
> > 
> > I'll also work on exporting scheduler statistics unless somebody wants
> > to beat me :)
> 
> Updated diff to use IPL_SCHED and rebased to apply on top of -current :) 
> 
> Index: arch/amd64/amd64/genassym.cf
> ===
> RCS file: /cvs/src/sys/arch/amd64/amd64/genassym.cf,v
> retrieving revision 1.40
> diff -u -p -r1.40 genassym.cf
> --- arch/amd64/amd64/genassym.cf  17 May 2019 19:07:15 -  1.40
> +++ arch/amd64/amd64/genassym.cf  1 Jun 2019 16:27:46 -
> @@ -32,7 +32,6 @@ export  VM_MIN_KERNEL_ADDRESS
>  
>  struct   proc
>  member   p_addr
> -member   p_priority
>  member   p_stat
>  member   p_wchan
>  member   P_MD_REGS   p_md.md_regs
> Index: arch/hppa/hppa/genassym.cf
> ===
> RCS file: /cvs/src/sys/arch/hppa/hppa/genassym.cf,v
> retrieving revision 1.47
> diff -u -p -r1.47 genassym.cf
> --- arch/hppa/hppa/genassym.cf9 Feb 2015 08:20:13 -   1.47
> +++ arch/hppa/hppa/genassym.cf1 Jun 2019 17:21:44 -
> @@ -130,7 +130,6 @@ membertf_cr30
>  # proc fields and values
>  struct   proc
>  member   p_addr
> -member   p_priority
>  member   p_stat
>  member   p_wchan
>  member   p_md
> Index: arch/i386/i386/esm.c
> 

Re: scheduler small changes

2019-06-01 Thread Amit Kulkarni
> > > The only reason I added quantum, was that I stumbled on the round robin 
> > > interval buglet. Initially added a fixed 100 ms per proc, and then 
> > > decided how much I could explore this quantum idea while still trying to 
> > > keep the code understandable.
> > 
> > Which buglet?  Should we fix it?
> 
> A minimal diff for the round robin interval buglet is attached at the end of 
> this email, doesn't use the SCHED_LOCK(). Note, I removed all the different 
> quantums, just kept it at the default 100 ms, to try to convey and determine 
> if it is my misunderstanding, and there is no buglet. I still haven't looked 
> deeply at implementing the sysctl for a separate systat view. Also, your 
> diffs to minimize SCHED_LOCK() contention, makes me think it is not right 
> time to introduce variable rr interval scheduling, without trying to reduce 
> the simple cases of SCHED_LOCK() contention first.


Aand, I forgot to explain this diff in the email.
Currently, because of the global 100 ms round-robin timer, we don't know when a
proc actually started running. It might have started 10 ms or 20 ms ago and
then suddenly be moved away because rrticks expires every 100 ms; the rr
interval clock and the proc's start time are never guaranteed to be in sync.
With this diff we try to give a proc its full quantum of 100 ms. Once it
completes that quantum, it signals a yield and lets other procs run. The other
kernel mechanisms are not touched at all, so a running proc can still get
preempted via sched_pause(), and every time any proc runs, its quantum is reset
back to 10 (rrticks_init). In the best case, each proc gets 100 ms of running
time, which is an improvement over the current behaviour. In the worst case, a
proc gets preempted by, or yields to, a higher priority proc, but this worst
case behaviour is the same before and after this diff.

Thanks
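
The difference between the two schemes can be sketched in userland as follows.
This is only an illustration under stated assumptions, not the diff itself:
struct sim_proc and the sim_* functions are hypothetical, one tick is assumed
to be 10 ms (hz = 100), and SIM_RRTICKS_INIT mirrors the value 10 mentioned for
rrticks_init above.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_RRTICKS_INIT 10	/* assumed: 10 ticks of 10 ms each = 100 ms */

struct sim_proc {
	int rrticks;		/* ticks left in this proc's quantum */
};

/* Per-proc scheme: every time a proc is switched in, its quantum restarts. */
static void
sim_switch_in(struct sim_proc *p)
{
	p->rrticks = SIM_RRTICKS_INIT;
}

/* Called once per tick while the proc runs; true means "time to yield". */
static bool
sim_tick(struct sim_proc *p)
{
	assert(p->rrticks > 0);
	return --p->rrticks == 0;
}

/*
 * Global-timer scheme: the timer fires at every multiple of
 * SIM_RRTICKS_INIT, so a proc that starts at start_tick only runs until
 * the next multiple.
 */
static int
sim_global_timer_slice(int start_tick)
{
	assert(start_tick >= 0);
	return SIM_RRTICKS_INIT - (start_tick % SIM_RRTICKS_INIT);
}
```

In this model a proc that starts 8 ticks into a global interval gets only 2
ticks before being moved away, while the per-proc countdown always allows 10
ticks unless the proc is preempted or yields earlier.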



Re: scheduler small changes

2019-05-31 Thread Amit Kulkarni
Hi,

Sorry for slacking off earlier, I was trying to recharge myself with some time 
off without looking at kernel code, and come back with a renewed focus.

> > > Regarding the choice of deriving quantum from the priority, are you sure
> > > the priorities are correct?  Should we keep priorities?  Or if userland
> > > needs priorities shouldn't we convert quantum into priority and not the
> > > other way around?
> > 
> > I am not entirely sure of the p_priority/usrpri/estcpu/load_avg 
> > calculations, as I am still trying to make sense of the code. But once we 
> > make sure all the p_priority calculations are consistent, I think the 
> > priorities are the way to go.
> 
> Why?  What's your goal?  In other words what kind of scheduling policy you
> want?  When should another thread be selected?  What about preemption? 

We should continue polishing current scheduling policy, maybe simplify it a 
bit. I can't answer this question right now, as I am still trying to understand 
the current design and the issues it is having.

> > If we go by deadlines, we will not have a way to understand how a proc is 
> > behaving in real time
> 
> What do you mean by behaving?  How much time did it spend running?  On
> which CPU?  Can't we monitor that?

To make sure we agree on what a deadline is: a deadline is a monotonically
increasing counter, where a single increment indicates that one 10 ms slice was
consumed. I was not sure how that would tell us about the recent past, i.e.
whether the proc had used any CPU within the last 1 sec. I am not sure how we
can monitor a p_deadline unless we add some other proc variables into the
calculation, like how we use p_estcpu or p_cpticks to derive p_priority.

I recently read up on Linux CFS on this page:
https://doc.opensuse.org/documentation/leap/tuning/html/book.sle.tuning/cha.tuning.taskscheduler.html
One way to do the scheduling with a p_deadline could be taken from Section
13.3.1, How CFS Works:

"When CFS schedules a task it accumulates “virtual runtime” or vruntime. The 
next task picked to run is always the task with the minimum accumulated 
vruntime so far. By balancing the red-black tree when tasks are inserted into 
the run queue (a planned time line of processes to be executed next), the task 
with the minimum vruntime is always the first entry in the red-black tree.

The amount of vruntime a task accrues is related to its priority. High priority 
tasks gain vruntime at a slower rate than low priority tasks, which results in 
high priority tasks being picked to run on the processor more often."

But again, this vruntime (deadline as used by Gregor/Michal) is used with 
priority. I think we cannot escape priority, as we support nice'ing a proc. 
Maybe there is a way to do scheduling decision without priority? At this time, 
it is beyond my understanding.
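
The quoted description can be turned into a small userland sketch to show how
priority still enters the picture. Everything here is an assumption made for
illustration: struct sim_task, the weight field, and SIM_SLICE are hypothetical
names, and a linear scan over an array replaces the red-black tree that CFS
uses. It is not Linux or OpenBSD code.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_SLICE 1024	/* vruntime charged per slice at weight 1 */

struct sim_task {
	unsigned long vruntime;	/* accumulated virtual runtime */
	unsigned int weight;	/* larger weight = higher priority */
	unsigned int runs;	/* number of slices this task received */
};

/* Return the index of the task with the smallest vruntime (first on ties). */
static size_t
sim_pick_next(const struct sim_task *t, size_t n)
{
	size_t i, best = 0;

	assert(n > 0);
	for (i = 1; i < n; i++)
		if (t[i].vruntime < t[best].vruntime)
			best = i;
	return best;
}

/* Make `slices' scheduling decisions, charging vruntime inversely to weight. */
static void
sim_run(struct sim_task *t, size_t n, unsigned int slices)
{
	while (slices-- > 0) {
		size_t i = sim_pick_next(t, n);

		assert(t[i].weight > 0);
		t[i].vruntime += SIM_SLICE / t[i].weight;
		t[i].runs++;
	}
}
```

With one task of weight 2 and one of weight 1, 30 slices end up split 20 to
10. The weight is where nice levels would plug in, so in this model the
counter alone does not replace priority.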

> > as a p_deadline is a history variable, how much a proc used its time 
> > slices. But if we stay with priorities, it is way simpler, and ranges 
> > strictly from 0 -> 127, and will work with current code. So the code 
> > changes will be minimal and easy to understand. Also, UNIX has historically 
> > had a concept of priority/nice levels, so I think we should stick with it.
> 
> So it's because it is simpler?  But if we follow this road, not changing
> anything is also simpler.  So why do you want to change the current
> code? :)

You got me there! :) We are all trying to improve, hopefully we keep it simple!

> > IMHO, a p_deadline field is a better substitute for p_slptime.
> 
> Why?

p_slptime is really only used for
1) starting the swapper after a 10 sec boot delay
2) choosing which parent process to swap out, namely the one with the proc
that has been sleeping the longest
3) setting the p_priority when waking up from sleep in sched_bsd.

For #1, we keep a 10 sec boot delay separately, for #2 we can substitute with a 
p_deadline, which is incremented right where p_cpticks is incremented, and we 
choose a proc with least amount of p_deadline in uvm_glue.c. For #3, in 
kern_synch, we can set the p_priority at the correct place(s) when a proc is 
coming back from SSLEEP, and in schedcpu() of sched_bsd.c we can update 
p_priority only for proc's with either a SONPROC or SRUN (reduced SCHED_LOCK() 
contention?), so p_slptime and updatepri() can completely go away. Note, this 
is theory only. Is this thinking valid?

> > The only reason I added quantum, was that I stumbled on the round robin 
> > interval buglet. Initially added a fixed 100 ms per proc, and then decided 
> > how much I could explore this quantum idea while still trying to keep the 
> > code understandable.
> 
> Which buglet?  Should we fix it?

A minimal diff for the round robin interval buglet is attached at the end of 
this email, doesn't use the SCHED_LOCK(). Note, I removed all the different 
quantums, just kept it at the default 100 ms, to try to convey and determine if 
it is my misunderstanding, and there is no buglet. I still haven't looked 
deeply at implementing 

Re: scheduler small changes

2019-05-17 Thread Amit Kulkarni
On Thu, 16 May 2019 15:15:24 -0300
Martin Pieuchot  wrote:

> On 16/05/19(Thu) 00:08, Amit Kulkarni wrote:
> > [...] 
> > > Regarding the choice of deriving quantum from the priority, are you sure
> > > the priorities are correct?  Should we keep priorities?  Or if userland
> > > needs priorities shouldn't we convert quantum into priority and not the
> > > other way around?
> > 
> > I am not entirely sure of the p_priority/usrpri/estcpu/load_avg 
> > calculations, as I am still trying to make sense of the code. But once we 
> > make sure all the p_priority calculations are consistent, I think the 
> > priorities are the way to go.
> 
> Why?  What's your goal?  In other words what kind of scheduling policy you
> want?  When should another thread be selected?  What about preemption? 
> 
> > If we go by deadlines, we will not have a way to understand how a proc is 
> > behaving in real time
> 
> What do you mean by behaving?  How much time did it spend running?  On
> which CPU?  Can't we monitor that?
> 
> > as a p_deadline is a history variable, how much a proc used its time 
> > slices. But if we stay with priorities, it is way simpler, and ranges 
> > strictly from 0 -> 127, and will work with current code. So the code 
> > changes will be minimal and easy to understand. Also, UNIX has historically 
> > had a concept of priority/nice levels, so I think we should stick with it.
> 
> So it's because it is simpler?  But if we follow this road, not changing
> anything is also simpler.  So why do you want to change the current
> code? :)
> 
> 
> > IMHO, a p_deadline field is a better substitute for p_slptime.
> 
> Why?
> 
> > The only reason I added quantum, was that I stumbled on the round robin 
> > interval buglet. Initially added a fixed 100 ms per proc, and then decided 
> > how much I could explore this quantum idea while still trying to keep the 
> > code understandable.
> 
> Which buglet?  Should we fix it?
> 
> > I would guess a lot of code in userland is based on priorities, the 
> > GNOME/KDE equivalent of top/classical top/htop etc... I would think of 
> > p_priority as a real time tracking field of how hot the proc is in using 
> > the CPU. In order to change the quantum, how would we decide to increment 
> > or decrement it, unless by a real time tracking field? There's code which 
> > already does this tracking for p_priority, it might be flawed or complex, 
> > so why not fix it and use it?
> 
> What do you mean by 'real time tracking field'?  I'd suggest you look at
> how priorities are set when a thread goes to sleep and how they are used
> when it is awoken.


Hi Martin,
I feel that you are trying to point out something by giving me some hints. 
Please give me some time to think through all possible scenarios, and I will 
send a detailed reply to all the points you raised above in this email, with a 
reduced+functional diff, over this weekend.

@Solene, thanks for testing, please ignore the previous diff for now.

Thanks



Re: scheduler small changes

2019-05-15 Thread Amit Kulkarni
> Why did you decide to change the data structure of the runqueue?  What
> problem are you trying to solve?

Thanks for your feedback. It forced me to do some introspection.

I was trying to explore whether we can tweak the current code and make it
faster, while still trying to keep it as simple as it is currently.

The explanation I gave in the diff for the removal of 32 TAILQs is wrong. That 
flawed analysis was set in my mind weeks ago, when I was just starting to read 
the scheduler code. I understand now that the current code is quite a bit 
better. It places a proc with low value of p_priority in lower array index and 
will return it first, and a higher value of p_priority will be placed in higher 
array index. So a proc is being carefully placed in sorted order of p_priority. 
I will revert this data structure change as the single TAILQ will not return a 
proc in the sorted p_priority order.
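
As a rough userland model of the layout described above, the sketch below
keeps 32 FIFO queues and always returns a proc from the lowest non-empty one.
The details are assumptions for illustration only: the sim_* names are
hypothetical, the queue index is taken as priority / 4 (128 priorities spread
over 32 queues), a bitmask records which queues are non-empty, and small
fixed-size arrays replace the TAILQs.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_NQS		32	/* number of run queues */
#define SIM_QLEN	8	/* capacity of each queue in this sketch */

struct sim_runq {
	uint32_t whichqs;		/* bit i set when queue i is non-empty */
	int q[SIM_NQS][SIM_QLEN];	/* circular FIFO of proc ids per queue */
	int head[SIM_NQS];
	int len[SIM_NQS];
};

/* Append a proc id to the queue that matches its priority. */
static void
sim_setrunqueue(struct sim_runq *rq, int id, int prio)
{
	int qi = prio >> 2;

	assert(prio >= 0 && prio < 128 && rq->len[qi] < SIM_QLEN);
	rq->q[qi][(rq->head[qi] + rq->len[qi]) % SIM_QLEN] = id;
	rq->len[qi]++;
	rq->whichqs |= UINT32_C(1) << qi;
}

/* Remove and return the first proc of the lowest non-empty queue, or -1. */
static int
sim_chooseproc(struct sim_runq *rq)
{
	int qi = 0, id;

	if (rq->whichqs == 0)
		return -1;
	while ((rq->whichqs & (UINT32_C(1) << qi)) == 0)
		qi++;
	id = rq->q[qi][rq->head[qi]];
	rq->head[qi] = (rq->head[qi] + 1) % SIM_QLEN;
	if (--rq->len[qi] == 0)
		rq->whichqs &= ~(UINT32_C(1) << qi);
	return id;
}
```

Inserting procs with priorities 100, 10, 50 and 11 and then draining the
structure returns them in the order 10, 11, 50, 100. A single FIFO would hand
them back in insertion order instead, which is the problem with the one-TAILQ
change.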

> Regarding your work, if you want to continue in the scheduler area, may
> I suggest you start by making the global counters per-spc and export
> them to userland via a sysctl?  Add a new view to systat(1) to see what's
> happening.  Without more visibility it's hard to confirm any theory.

This is an excellent idea. It will take me some time to understand the 
sysctl/systat part and implement, but I will try to address within a week or 
two.

> 
> Regarding the choice of deriving quantum from the priority, are you sure
> the priorities are correct?  Should we keep priorities?  Or if userland
> needs priorities shouldn't we convert quantum into priority and not the
> other way around?

I am not entirely sure of the p_priority/usrpri/estcpu/load_avg calculations, 
as I am still trying to make sense of the code. But once we make sure all the 
p_priority calculations are consistent, I think the priorities are the way to 
go. If we go by deadlines, we will not have a way to understand how a proc is 
behaving in real time, as a p_deadline is a history variable, how much a proc 
used its time slices. But if we stay with priorities, it is way simpler, and 
ranges strictly from 0 -> 127, and will work with current code. So the code 
changes will be minimal and easy to understand. Also, UNIX has historically had 
a concept of priority/nice levels, so I think we should stick with it. IMHO, a 
p_deadline field is a better substitute for p_slptime.

The only reason I added quantum, was that I stumbled on the round robin 
interval buglet. Initially added a fixed 100 ms per proc, and then decided how 
much I could explore this quantum idea while still trying to keep the code 
understandable.

I would guess a lot of code in userland is based on priorities, the GNOME/KDE 
equivalent of top/classical top/htop etc... I would think of p_priority as a 
real time tracking field of how hot the proc is in using the CPU. In order to 
change the quantum, how would we decide to increment or decrement it, unless by 
a real time tracking field? There's code which already does this tracking for 
p_priority, it might be flawed or complex, so why not fix it and use it?

> Regarding your diff, if you find a thread in RUN state inside the sleep
> queue (wakeup_n()), something is really wrong there.

I added that SRUN assert weeks ago when I was trying to add some code and
figure things out, and broke the kernel immediately on boot. The assert made
the problem go away, and until you noticed it, I had forgotten about it. I
removed it today and compiled kernel/userland while browsing on one desktop,
and ran a kernel build on another desktop. It worked out fine.

Thanks



scheduler small changes

2019-05-15 Thread Amit Kulkarni
Hi,

This effort is heavily based on Gregor's and Michal's diffs. Tried to
incorporate the feedback given to them by different people in 2011/2016. Split
the new code into an ifdef, so people can do a straight comparison; tried very
hard not to delete existing code, just shifted it around. Main motivation was
to find out if it is possible to do incremental improvements in the scheduler.
After some time, I still have not completely understood the load avg part, the
p_priority/p_usrpri calculations, or the cpuset code. Looked heavily at
Dragonfly (simpler than FreeBSD) and FreeBSD code. As a newcomer, OpenBSD code
is way easier to read. Thanks for the detailed explanations given to Michal and
also in later diffs posted to tech@; they helped a lot when trying to
understand the decisions made by other devs. The commit messages help a little,
but the explanations help a lot more. This invites discussion, and end users
learn a lot more in the process.

This diff survived multiple tens of kernel builds, a bsd.sp build, a bsd.rd
build, a bsd.mp without these changes, userland/xenocara, and a make regress a
few days ago, all on 3 amd64 desktops. Ran under all the scenarios listed in
the previous sentence. No major invasive changes attempted; all tools should
work as is. If something is broken, please let me know. This seems faster than
current, but not sure by how much; a placebo effect may be in play.

I think there is a bug in resched_proc(), which is fixed in mi_switch(). Also
included: a quick enhancement in cpu_switchto(), changes to p_slptime, and a
precise round robin interval for each proc unless it is preempted or it yields.

Tried to fiddle with different time slices other than hz/10, not sure if it
will work on other arches, but tried to stay within MI code, so it should work.
Other than counting, I have no idea how to verify that a proc is actually
switched away after its rr interval is over. Just went by what is there in the
existing code. Tried giving a higher slice like 200 ms, but didn't want to
exceed the 1 sec limit of rucheck() in kern_resource.

Tried to do rudimentary work on affinity without having an affinity field in
struct proc or struct schedstate_percpu (like Dragonfly/FreeBSD does). Observed
the imbalance in runqs. Affinity works most of the time under light load.
There's a problem when I try to comment out sched_steal_proc() in kern_sched;
that is the problem with this iteration of the diff.

Not sure if the reduction from 32 queues to a single TAILQ would be accepted, 
but tried it anyway, it is definitely faster. This code tries to reduce the 
complexity in deciding which queue to place the proc on. There is no current 
support for a real-time queue or other types of scheduling classes, so IMHO 
this is a simplification.

Tried to give detailed explanation of thinking. Sent the complete diff, but 
will split diff, if parts of it are found to be valid.

In any case, a request to please accept a small change in top, to display 
p_priority directly.

Thanks


diff --git a/sys/conf/GENERIC b/sys/conf/GENERIC
index d6a4fdcb0e2..73995746e23 100644
--- a/sys/conf/GENERIC
+++ b/sys/conf/GENERIC
@@ -3,6 +3,7 @@
 #  Machine-independent option; used by all architectures for their
 #  GENERIC kernel
 
+option         TSTSCHED        # test scheduler
 option         DDB             # in-kernel debugger
 #option        DDBPROF         # ddb(4) based profiling
 #option        DDB_SAFE_CONSOLE # allow break into ddb during boot
diff --git a/sys/kern/init_main.c b/sys/kern/init_main.c
index effefd552c3..be58199d834 100644
--- a/sys/kern/init_main.c
+++ b/sys/kern/init_main.c
@@ -129,6 +129,10 @@ int    ncpus =  1;
 int    ncpusfound = 1; /* number of cpus we find */
 volatile int start_init_exec;  /* semaphore for start_init() */
 
+#ifdef TSTSCHED
+int sched_start_dquantum = 0;
+#endif
+
 #if !defined(NO_PROPOLICE)
 long   __guard_local __attribute__((section(".openbsd.randomdata")));
 #endif
@@ -572,6 +576,21 @@ main(void *framep)
 
start_periodic_resettodr();
 
+   /* Inited all systems, ready to start the test scheduler now!
+*
+* I saw some instability, when rr_intvl not set to hz/10 during
+* boot, so resorting to fiddling with rr interval after we
+* have booted up fully.
+* */
+#ifdef TSTSCHED
+   printf("-\n");
+   printf("-\n");
+   printf("-  TEST SCHEDULER starting   -\n");
+   printf("-\n");
+   printf("-\n");
+   sched_start_dquantum = 1;
+#endif
+
 /*
  * proc0: nothing to do, back to sleep
  */
diff --git a/sys/kern/kern_clock.c b/sys/kern/kern_clock.c
index ee701945966..8db6611bff3 100644
--- a/sys/kern/kern_clock.c
+++ b/sys/kern/kern_clock.c
@@ -143,7 +143,6 @@ void
 

Re: SCHED_LOCK vs 'struct proc'

2019-05-11 Thread Amit Kulkarni
On Sat, May 11, 2019 at 6:29 PM Martin Pieuchot  wrote:
>
> Document which fields are protected by the SCHED_LOCK(), ok?
>
> Index: sys/proc.h
> ===
> RCS file: /cvs/src/sys/sys/proc.h,v
> retrieving revision 1.263
> diff -u -p -r1.263 proc.h
> --- sys/proc.h  6 Jan 2019 12:59:45 -   1.263
> +++ sys/proc.h  11 May 2019 23:21:57 -
> @@ -297,8 +297,12 @@ struct process {
>  struct kcov_dev;
>  struct lock_list_entry;
>
> +/*
> + *  Locks used to protect struct members in this file:
> + * s   scheduler lock
> + */
>  struct proc {
> -   TAILQ_ENTRY(proc) p_runq;
> +   TAILQ_ENTRY(proc) p_runq;   /* [s] current run/sleep queue */
> LIST_ENTRY(proc) p_list;/* List of all threads. */
>
> struct  process *p_p;   /* The process of this thread. */
> @@ -314,7 +318,7 @@ struct proc {
>
> int p_flag; /* P_* flags. */
> u_char  p_spare;/* unused */
> -   char p_stat; /* S* process status. */
> +   char p_stat; /* [s] S* process status. */
> char p_pad1[1];
> u_char  p_descfd;   /* if not 255, fdesc permits this fd */
>
> @@ -328,17 +332,17 @@ struct proc {
> long p_thrslpid; /* for thrsleep syscall */
>
> /* scheduling */
> -   u_int   p_estcpu;/* Time averaged value of p_cpticks. */
> +   u_int   p_estcpu;   /* [s] Time averaged val of p_cpticks */
> int p_cpticks;   /* Ticks of cpu time. */
> -   const volatile void *p_wchan;/* Sleep address. */
> +   const volatile void *p_wchan;   /* [s] Sleep address. */
> struct  timeout p_sleep_to;/* timeout for tsleep() */
> -   const char *p_wmesg; /* Reason for sleep. */
> -   fixpt_t p_pctcpu;/* %cpu for this thread */
> -   u_int   p_slptime;   /* Time since last blocked. */
> +   const char *p_wmesg;/* [s] Reason for sleep. */
> +   fixpt_t p_pctcpu;   /* [s] %cpu for this thread */
> +   u_int   p_slptime;  /* [s] Time since last blocked. */

Thanks for this diff, it is pretty useful for anyone reading SCHED_LOCK'd code.

Can you please change the description of p_slptime to:
 /* [s] Time since last run (in secs). */

since p_slptime is reset to 0 in setrunnable() in sched_bsd.c?

> u_int   p_uticks;   /* Statclock hits in user mode. */
> u_int   p_sticks;   /* Statclock hits in system mode. */
> u_int   p_iticks;   /* Statclock hits processing intr. */
> -   struct  cpu_info * volatile p_cpu; /* CPU we're running on. */
> +   struct  cpu_info * volatile p_cpu; /* [s] CPU we're running on. */
>
> struct  rusage p_ru;/* Statistics */
> struct  tusage p_tu;/* accumulated times. */
> @@ -357,8 +361,8 @@ struct proc {
> vaddr_t  p_spstart;
> vaddr_t  p_spend;
>
> -   u_char  p_priority; /* Process priority. */
> -   u_char  p_usrpri;   /* User-priority based on p_estcpu and 
> ps_nice. */
> +   u_char  p_priority; /* [s] Process priority. */
> +   u_char  p_usrpri;   /* [s] User-prio based on p_estcpu & ps_nice. 
> */
> int p_pledge_syscall;   /* Cache of current syscall */
>
> struct  ucred *p_ucred; /* cached credentials */
>



misleading comment in preempt()

2019-03-09 Thread Amit Kulkarni
preempt() takes no process argument, so the comment is misleading...

diff --git kern/sched_bsd.c kern/sched_bsd.c
index 00a08861b59..0b63276d1ff 100644
--- kern/sched_bsd.c
+++ kern/sched_bsd.c
@@ -318,9 +318,7 @@ yield(void)
 
 /*
  * General preemption call.  Puts the current process back on its run queue
- * and performs an involuntary context switch.  If a process is supplied,
- * we switch to that process.  Otherwise, we use the normal process selection
- * criteria.
+ * and performs an involuntary context switch.
  */
 void
 preempt(void)



Re: sched_choosecpu_fork()

2019-02-20 Thread Amit Kulkarni
> In sched_choosecpu_fork(), we see best_run is INT_MAX, so the comparison below
> actually reduces to an assignment ==> choice = ci;
> -   if (choice == NULL || run < best_run ||
> -   (run == best_run && load < best_load)) {
> -   choice = ci;
> -   best_load = load;
> -   best_run = run;
> -   }
> 
> because run will always be less than INT_MAX + run can rarely be INT_MAX, and 
> choice is always NULL!
> 
> In sched_choosecpu(), last_cost is not being used where it could be useful. 
> Setting to INT_MAX, the comparison will always be false. This is like you can 
> safely delete sched_proc_to_cpu_cost() also. choice is again null.
> 
>   int cost = sched_proc_to_cpu_cost(ci, p);
> 
>   if (choice == NULL || cost < last_cost) {
>   choice = ci;
>   last_cost = cost;
>   }
>   cpuset_del(, ci);
> 

I just realized that the above is only true for the first iteration of the
loop, so please ignore that part!
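
The correction can be illustrated with a small stand-alone sketch of the same
selection loop. It is not the kernel code: struct cpu_sample and pick_best()
are hypothetical stand-ins, and the tie-break on load is assumed from the
variable names in the quoted snippet (load, best_load). The INT_MAX sentinel
only guarantees that the first entry is taken; from the second iteration on,
the comparison does real work.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU scheduler state in the quoted code. */
struct cpu_sample {
	int run;		/* queued runnable threads */
	unsigned int load;	/* load average */
};

/*
 * Return the index of the best entry: fewest queued threads first,
 * lowest load as the tie-breaker. Returns -1 if n == 0.
 */
int
pick_best(const struct cpu_sample *cpus, size_t n)
{
	int choice = -1;
	int best_run = INT_MAX;
	unsigned int best_load = ~0U;
	size_t i;

	for (i = 0; i < n; i++) {
		int run = cpus[i].run;
		unsigned int load = cpus[i].load;

		/*
		 * First pass: choice == -1, so the entry is always taken.
		 * Later passes: best_run and best_load hold real values,
		 * so the comparison performs the actual selection.
		 */
		if (choice == -1 || run < best_run ||
		    (run == best_run && load < best_load)) {
			choice = (int)i;
			best_load = load;
			best_run = run;
		}
	}
	return choice;
}
```

With the entries {3,10}, {1,50}, {1,20}, {2,5}, the loop takes the first entry
unconditionally, then moves to the second (fewer queued threads), then to the
third (same queue length, lower load), and rejects the fourth.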



Re: sched_choosecpu_fork()

2019-02-20 Thread Amit Kulkarni
On Wed, 20 Feb 2019 14:08:45 -0300
Martin Pieuchot  wrote:

> When choosing the initial CPU for a new thread to run on, the scheduler
> does not consider that CPU0 is handling interrupts.
> 
> Threads that are created early, like the network taskq, always end up
> scheduled on CPU0.  If the machine isn't running many programs, like
> simple firewalls, there won't be contention and the thread will always
> stay on its original CPU.
> 
> This behavior derives from how yield() and preempt() are implemented.  These
> functions do not look for another CPU to run on when a busy thread tries to
> be cooperative.  This is what we want.  However in the case of the  network
> taskq we'd prefer to avoid CPU0.
> 
> I don't know if enforcing this behavior makes sense or if we should
> accept that in some moon phases our forwarding performances are only 50%
> of what they could be.
> 
> However using the same logic to pick a CPU during fork as during a
> sleep cycle seems to be good enough to avoid the problem described above.
> If there's contention the taskq will move between CPUs, if there isn't it
> will stay, generally, on CPU1.
> 
> Since FORK_PPWAIT hasn't been implemented for 10 years, I'd say it's also a
> simplification :o)
> 
> What would you think of killing sched_choosecpu_fork()?
> 
> Any other suggestion?
> 
> ok?
> 
> Index: kern/kern_fork.c
> ===
> RCS file: /cvs/src/sys/kern/kern_fork.c,v
> retrieving revision 1.209
> diff -u -p -r1.209 kern_fork.c
> --- kern/kern_fork.c  6 Jan 2019 12:59:45 -   1.209
> +++ kern/kern_fork.c  15 Feb 2019 13:49:32 -
> @@ -329,7 +329,7 @@ fork_thread_start(struct proc *p, struct
>  
>   SCHED_LOCK(s);
>   p->p_stat = SRUN;
> - p->p_cpu = sched_choosecpu_fork(parent, flags);
> + p->p_cpu = sched_choosecpu(parent);
>   setrunqueue(p);
>   SCHED_UNLOCK(s);
>  }
> Index: kern/kern_sched.c
> ===
> RCS file: /cvs/src/sys/kern/kern_sched.c,v
> retrieving revision 1.54
> diff -u -p -r1.54 kern_sched.c
> --- kern/kern_sched.c 17 Nov 2018 23:10:08 -  1.54
> +++ kern/kern_sched.c 15 Feb 2019 13:50:06 -
> @@ -340,62 +340,6 @@ again:
>  }
>  
>  struct cpu_info *
> -sched_choosecpu_fork(struct proc *parent, int flags)
> -{
> -#ifdef MULTIPROCESSOR
> - struct cpu_info *choice = NULL;
> - fixpt_t load, best_load = ~0;
> - int run, best_run = INT_MAX;
> - struct cpu_info *ci;
> - struct cpuset set;
> -
> -#if 0
> - /*
> -  * XXX
> -  * Don't do this until we have a painless way to move the cpu in exec.
> -  * Preferably when nuking the old pmap and getting a new one on a
> -  * new cpu.
> -  */
> - /*
> -  * PPWAIT forks are simple. We know that the parent will not
> -  * run until we exec and choose another cpu, so we just steal its
> -  * cpu.
> -  */
> - if (flags & FORK_PPWAIT)
> - return (parent->p_cpu);
> -#endif
> -
> - /*
> -  * Look at all cpus that are currently idle and have nothing queued.
> -  * If there are none, pick the one with least queued procs first,
> -  * then the one with lowest load average.
> -  */
> - cpuset_complement(, _queued_cpus, _idle_cpus);
> - cpuset_intersection(, , _all_cpus);
> - if (cpuset_first() == NULL)
> - cpuset_copy(, _all_cpus);
> -
> - while ((ci = cpuset_first()) != NULL) {
> - cpuset_del(, ci);
> -
> - load = ci->ci_schedstate.spc_ldavg;
> - run = ci->ci_schedstate.spc_nrun;
> -
> - if (choice == NULL || run < best_run ||
> - (run == best_run & < best_load)) {
> - choice = ci;
> - best_load = load;
> - best_run = run;
> - }
> - }
> -
> - return (choice);
> -#else
> - return (curcpu());
> -#endif
> -}
> -
> -struct cpu_info *
>  sched_choosecpu(struct proc *p)
>  {
>  #ifdef MULTIPROCESSOR
> Index: sys/sched.h
> ===
> RCS file: /cvs/src/sys/sys/sched.h,v
> retrieving revision 1.50
> diff -u -p -r1.50 sched.h
> --- sys/sched.h   17 Nov 2018 23:10:08 -  1.50
> +++ sys/sched.h   15 Feb 2019 13:50:14 -
> @@ -150,7 +150,6 @@ void mi_switch(void);
>  void cpu_switchto(struct proc *, struct proc *);
>  struct proc *sched_chooseproc(void);
>  struct cpu_info *sched_choosecpu(struct proc *);
> -struct cpu_info *sched_choosecpu_fork(struct proc *parent, int);
>  void cpu_idle_enter(void);
>  void cpu_idle_cycle(void);
>  void cpu_idle_leave(void);
> 

In sched_choosecpu_fork(), we see best_run is INT_MAX, so the comparison below
actually reduces to an assignment ==> choice = ci;
-   if (choice == NULL || run < best_run ||
-   (run == best_run && load < best_load)) {
-   choice = ci;
- 

Re: sysctl kern.pool.XXX

2019-01-30 Thread Amit Kulkarni
> > This comment below is misleading. There are no such sysctl's. The defines 
> > are only used in kern/subr_pool.c
> >
>
> ... to implement the sysctls. What are you talking about?
>

Ah, I see. These are not present, leading me to wonder, why are they
being mentioned here?



sysctl kern.pool.XXX

2019-01-30 Thread Amit Kulkarni
The comment below is misleading: there are no such sysctls. The defines are
only used in kern/subr_pool.c.



diff --git sys/pool.h sys/pool.h
index d2f05227b7a..e97f774a272 100644
--- sys/pool.h
+++ sys/pool.h
@@ -34,12 +34,6 @@
 #ifndef _SYS_POOL_H_
 #define _SYS_POOL_H_
 
-/*
- * sysctls.
- * kern.pool.npools
- * kern.pool.name.
- * kern.pool.pool.
- */
 #define KERN_POOL_NPOOLS   1
 #define KERN_POOL_NAME 2
 #define KERN_POOL_POOL 3



remove 2 duplicate functions in uvm anon

2019-01-15 Thread Amit Kulkarni
uao_reference() vs uao_reference_locked()
uao_detach() vs uao_detach_locked()

Reading the code shows that uao_reference() only calls uao_reference_locked(),
and uao_detach() only calls uao_detach_locked(). Fold each pair into the one
function that external callers use, and remove the internal function, but
preserve its comments. Tested on amd64.
Thanks


diff --git uvm/uvm_aobj.c uvm/uvm_aobj.c
index 63e6c993fc2..5fb1fbd9b2a 100644
--- uvm/uvm_aobj.c
+++ uvm/uvm_aobj.c
@@ -810,16 +810,6 @@ uao_init(void)
 void
 uao_reference(struct uvm_object *uobj)
 {
-   uao_reference_locked(uobj);
-}
-
-/*
- * uao_reference_locked: add a ref to an aobj
- */
-void
-uao_reference_locked(struct uvm_object *uobj)
-{
-
/* kernel_object already has plenty of references, leave it alone. */
if (UVM_OBJ_IS_KERN_OBJECT(uobj))
return;
@@ -830,21 +820,11 @@ uao_reference_locked(struct uvm_object *uobj)
 
 /*
  * uao_detach: drop a reference to an aobj
- */
-void
-uao_detach(struct uvm_object *uobj)
-{
-   uao_detach_locked(uobj);
-}
-
-
-/*
- * uao_detach_locked: drop a reference to an aobj
  *
  * => aobj may freed upon return.
  */
 void
-uao_detach_locked(struct uvm_object *uobj)
+uao_detach(struct uvm_object *uobj)
 {
struct uvm_aobj *aobj = (struct uvm_aobj *)uobj;
struct vm_page *pg;
@@ -1286,7 +1266,7 @@ uao_swap_off(int startslot, int endslot)
 * add a ref to the aobj so it doesn't disappear
 * while we're working.
 */
-   uao_reference_locked(>u_obj);
+   uao_reference(>u_obj);
 
/*
 * now it's safe to unlock the uao list.
@@ -1295,7 +1275,7 @@ uao_swap_off(int startslot, int endslot)
mtx_leave(_list_lock);
 
if (prevaobj) {
-   uao_detach_locked(>u_obj);
+   uao_detach(>u_obj);
prevaobj = NULL;
}
 
@@ -1305,7 +1285,7 @@ uao_swap_off(int startslot, int endslot)
 */
rv = uao_pagein(aobj, startslot, endslot);
if (rv) {
-   uao_detach_locked(>u_obj);
+   uao_detach(>u_obj);
return rv;
}
 
@@ -1328,7 +1308,7 @@ uao_swap_off(int startslot, int endslot)
/* done with traversal, unlock the list */
mtx_leave(_list_lock);
if (prevaobj) {
-   uao_detach_locked(>u_obj);
+   uao_detach(>u_obj);
}
return FALSE;
 }
diff --git uvm/uvm_extern.h uvm/uvm_extern.h
index a473f251229..9402e41fe10 100644
--- uvm/uvm_extern.h
+++ uvm/uvm_extern.h
@@ -261,9 +261,7 @@ voidvmapbuf(struct buf *, vsize_t);
 void   vunmapbuf(struct buf *, vsize_t);
 struct uvm_object  *uao_create(vsize_t, int);
 void   uao_detach(struct uvm_object *);
-void   uao_detach_locked(struct uvm_object *);
 void   uao_reference(struct uvm_object *);
-void   uao_reference_locked(struct uvm_object *);
 intuvm_fault(vm_map_t, vaddr_t, vm_fault_t, vm_prot_t);
 
 vaddr_tuvm_uarea_alloc(void);



Re: replacing timeout_add() with timeout_add_msec()

2019-01-12 Thread Amit Kulkarni
> You started to convert the wrong timeout_add() calls. Doing a blind
> timeout_add() to timeout_add_msec() conversion especially on calls with 0
> or 1 tick sleep time is the wrong approach.
> The right approach is to identify calls that do sleep for a well defined time
> (1sec, 50ms or whatever) and convert those. Most of the timeouts you
> changed are not in that category. First I would look at timeouts that have
> a multiplication with hz in them since those wait for a specific time.

Thanks for the feedback Claudio and Mark!

Here is a new diff based on your suggestions. I am only looking for feedback;
no tests requested yet. In the diff below I assumed that hz ticks == 1 sec
(confirmed in timeout_add_sec() in /sys/kern/kern_timeout.c), and converted
some multiplications and some divisions, all involving hz. I also converted
some calls in comments, because they might be used in the future.
-- amit
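
The arithmetic behind these conversions can be sketched as follows. This is
only an illustration under the assumption stated above (hz ticks make one
second); hz_expr_to_msec() and is_whole_seconds() are hypothetical helpers,
not kernel functions.

```c
/*
 * Hypothetical helper, not part of the kernel: given a tick expression of
 * the form (mul * hz) / div, return the duration it represents in
 * milliseconds. Because hz ticks make one second, hz itself cancels out.
 *
 *   60 * hz  -> mul = 60, div = 1   -> 60000 ms
 *   hz * 2   -> mul = 2,  div = 1   ->  2000 ms
 *   hz / 30  -> mul = 1,  div = 30  ->    33 ms (truncated)
 *   hz / 10  -> mul = 1,  div = 10  ->   100 ms
 */
long
hz_expr_to_msec(long mul, long div)
{
	return (mul * 1000L) / div;
}

/*
 * A duration that is a whole number of seconds can be expressed in
 * seconds; anything else needs millisecond resolution.
 */
int
is_whole_seconds(long msec)
{
	return (msec % 1000L) == 0;
}
```

Durations that are whole seconds map to timeout_add_sec() in the diff, and the
fractional ones (hz / 30, hz / 10) map to timeout_add_msec() with 33 and 100.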


diff --git sys/arch/armv7/exynos/exuart.c sys/arch/armv7/exynos/exuart.c
index 4b0588750ea..15086bc5976 100644
--- sys/arch/armv7/exynos/exuart.c
+++ sys/arch/armv7/exynos/exuart.c
@@ -283,7 +283,7 @@ exuart_intr(void *arg)
if (p >= sc->sc_ibufend) {
sc->sc_floods++;
if (sc->sc_errors++ == 0)
-   timeout_add(>sc_diag_tmo, 60 * hz);
+   timeout_add_sec(>sc_diag_tmo, 60);
} else {
*p++ = c;
if (p == sc->sc_ibufhigh && ISSET(tp->t_cflag, CRTSCTS))
@@ -710,7 +710,7 @@ exuartclose(dev_t dev, int flag, int mode, struct proc *p)
/* tty device is waiting for carrier; drop dtr then re-raise */
//CLR(sc->sc_ucr3, EXUART_CR3_DSR);
//bus_space_write_2(iot, ioh, EXUART_UCR3, sc->sc_ucr3);
-   timeout_add(>sc_dtr_tmo, hz * 2);
+   timeout_add_sec(>sc_dtr_tmo, 2);
} else {
/* no one else waiting; turn off the uart */
exuart_pwroff(sc);
diff --git sys/arch/sparc64/dev/fd.c sys/arch/sparc64/dev/fd.c
index 8d548062f83..654d8c95524 100644
--- sys/arch/sparc64/dev/fd.c
+++ sys/arch/sparc64/dev/fd.c
@@ -1632,7 +1632,7 @@ loop:
fdc->sc_state = RECALCOMPLETE;
if (fdc->sc_flags & FDC_NEEDHEADSETTLE) {
/* allow 1/30 second for heads to settle */
-   timeout_add(>fdcpseudointr_to, hz / 30);
+   timeout_add_msec(>fdcpseudointr_to, 33);
return (1); /* will return later */
}
 
diff --git sys/dev/fdt/imxuart.c sys/dev/fdt/imxuart.c
index 84c7eb5aee6..c2fd7e4a6d3 100644
--- sys/dev/fdt/imxuart.c
+++ sys/dev/fdt/imxuart.c
@@ -228,7 +228,7 @@ imxuart_intr(void *arg)
if (p >= sc->sc_ibufend) {
sc->sc_floods++;
if (sc->sc_errors++ == 0)
-   timeout_add(>sc_diag_tmo, 60 * hz);
+   timeout_add_sec(>sc_diag_tmo, 60);
} else {
*p++ = c;
if (p == sc->sc_ibufhigh &&
@@ -457,7 +457,7 @@ imxuart_softint(void *arg)
if (ISSET(c, IMXUART_RX_OVERRUN)) {
sc->sc_overflows++;
if (sc->sc_errors++ == 0)
-   timeout_add(>sc_diag_tmo, 60 * hz);
+   timeout_add_sec(>sc_diag_tmo, 60);
}
/* This is ugly, but fast. */
 
@@ -629,7 +629,7 @@ imxuartclose(dev_t dev, int flag, int mode, struct proc *p)
/* tty device is waiting for carrier; drop dtr then re-raise */
CLR(sc->sc_ucr3, IMXUART_CR3_DSR);
bus_space_write_2(iot, ioh, IMXUART_UCR3, sc->sc_ucr3);
-   timeout_add(>sc_dtr_tmo, hz * 2);
+   timeout_add_sec(>sc_dtr_tmo, 2);
} else {
/* no one else waiting; turn off the uart */
imxuart_pwroff(sc);
diff --git sys/dev/ic/if_wi_hostap.c sys/dev/ic/if_wi_hostap.c
index 64e3c10f3f5..155a391e7f9 100644
--- sys/dev/ic/if_wi_hostap.c
+++ sys/dev/ic/if_wi_hostap.c
@@ -410,7 +410,7 @@ wihap_sta_timeout(void *v)
 
/* Add wihap timeout if we have not already done so. */
if (!timeout_pending(>tmo))
-   timeout_add(>tmo, hz / 10);
+   timeout_add_msec(>tmo, 100);
 
splx(s);
 }
diff --git sys/dev/ic/pluart.c sys/dev/ic/pluart.c
index 0f024c0ad34..19bbb76f4a6 100644
--- sys/dev/ic/pluart.c
+++ sys/dev/ic/pluart.c
@@ -225,7 +225,7 @@ pluart_intr(void *arg)
if (p >= sc->sc_ibufend) {
sc->sc_floods++;
if (sc->sc_errors++ == 0)
-   timeout_add(>sc_diag_tmo, 60 * hz);
+   timeout_add_sec(>sc_diag_tmo, 60);
} else {
*p++ = c;
  

Re: replacing timeout_add() with timeout_add_msec()

2019-01-06 Thread Amit Kulkarni
> > Even on amd64, I won't be able to test, because of missing hardware.
> > If you think something is wrong, please will you let me have your
> > feedback?
> 
> I'm a bit stunned at the zeal to push untested diffs into the tree
> 
> (you didn't ask anyone to test it for you)

I asked for critical review or feedback in the initial email, to learn whether
I am going down the right path. If the approach is ok, I was planning to sit
down and convert all the straightforward timeout_add() calls in the tree to
timeout_add_msec(), then ask for another review, before finally requesting a
test run.

Requesting a test while I am unsure of the approach in the code would lead to
low or zero confidence from others when I ask in the future.



Re: replacing timeout_add() with timeout_add_msec()

2019-01-06 Thread Amit Kulkarni
Even on amd64, I won't be able to test, because of missing hardware.
If you think something is wrong, will you please let me have your
feedback?

Thanks


On Sun, Jan 6, 2019 at 4:56 PM Theo de Raadt  wrote:
>
> Amit Kulkarni  wrote:
>
> > Hi,
> >
> > Referring to the end of mpi's message, and also mlarkin@'s later comment:
> > https://marc.info/?l=openbsd-tech&m=154577028830964&w=2
> >
> > I am trying to replace some easy timeout_add() calls with 
> > timeout_add_msec().
> >
> > My current understanding with the occurrences of timeout_add() in the tree 
> > is that: if there is a hardcoded call like timeout_add(struct timeout, 1), 
> > then replace with timeout_add_msec(struct timeout, 10). That is, 1 tick = 
> > 10 msec.
> >
> > So if there's a hardcoded call like timeout_add(struct timeout, 5), then 
> > replace with timeout_add_msec(struct timeout, 50).
> >
> > If there are hz calculations which I don't understand like for example in 
> > /sys/arch/alpha/tc/ioasic.c, then I am skipping these for now.
> > if (alpha_led_blink != 0) {
> > timeout_set(_blink_state.tmo, ioasic_led_blink, NULL);
> > timeout_add(_blink_state.tmo,
> > (((averunnable.ldavg[0] + FSCALE) * hz) >> (FSHIFT + 
> > 3)));
> > }
> >
> > A call like timeout_add(struct timeout, 0) is replaced by an equivalent 
> > call to timeout_add_msec(struct timeout, 0).
> >
> > Both the above scenarios are in the following diff and un-tested (not 
> > compiled also, for now), no way I can test some of these, as I don't have 
> > access to hardware. Mainly looking for critical review and feedback to get 
> > this going in the right direction.
> >
> > Thanks for your time!
> >
> > diff --git arch/alpha/alpha/promcons.c arch/alpha/alpha/promcons.c
> > index 9efabd3bf1c..b872f6e3931 100644
> > --- arch/alpha/alpha/promcons.c
> > +++ arch/alpha/alpha/promcons.c
> > @@ -100,7 +100,7 @@ promopen(dev, flag, mode, p)
> >   error = (*linesw[tp->t_line].l_open)(dev, tp, p);
> >   if (error == 0 && setuptimeout) {
> >   timeout_set(_to, promtimeout, tp);
> > - timeout_add(_to, 1);
> > + timeout_add_msec(_to, 10);
> >   }
> >   return error;
> >  }
> > @@ -220,7 +220,7 @@ promtimeout(v)
> >   if (tp->t_state & TS_ISOPEN)
> >   (*linesw[tp->t_line].l_rint)(c, tp);
> >   }
> > - timeout_add(_to, 1);
> > + timeout_add_msec(_to, 10);
> >  }
> >
> >  struct tty *
>
> I am glad you have an alpha, and will be able to test your proposed change.



replacing timeout_add() with timeout_add_msec()

2019-01-06 Thread Amit Kulkarni
Hi,

Referring to the end of mpi's message, and also mlarkin@'s later comment:
https://marc.info/?l=openbsd-tech&m=154577028830964&w=2

I am trying to replace some easy timeout_add() calls with timeout_add_msec().

My current understanding of the occurrences of timeout_add() in the tree is
this: if there is a hardcoded call like timeout_add(struct timeout, 1), then
replace it with timeout_add_msec(struct timeout, 10). That is, 1 tick = 10 msec.

So if there's a hardcoded call like timeout_add(struct timeout, 5), then 
replace with timeout_add_msec(struct timeout, 50).

Where there are hz calculations which I don't understand, such as this one in
/sys/arch/alpha/tc/ioasic.c, I am skipping them for now:
if (alpha_led_blink != 0) {
timeout_set(_blink_state.tmo, ioasic_led_blink, NULL);
timeout_add(_blink_state.tmo,
(((averunnable.ldavg[0] + FSCALE) * hz) >> (FSHIFT + 3)));
}

A call like timeout_add(struct timeout, 0) is replaced by an equivalent call to 
timeout_add_msec(struct timeout, 0).

Both of the above scenarios are in the following diff, which is untested (not
even compiled, for now). There is no way I can test some of these changes, as
I don't have access to the hardware. I am mainly looking for critical review
and feedback to get this going in the right direction.

Thanks for your time!
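
The "1 tick = 10 msec" rule used above holds only when hz is 100. A minimal
sketch of the general conversion, assuming hz is the number of clock ticks per
second; ticks_to_msec() is a hypothetical helper for illustration, not a
kernel API, and the hz value 1000 is chosen purely as an example:

```c
/*
 * Hypothetical helper, not a kernel API: convert a raw tick count to
 * milliseconds for a given hz (clock ticks per second).
 *
 *   ticks_to_msec(1, 100)  -> 10 ms  (the rule used in this mail)
 *   ticks_to_msec(5, 100)  -> 50 ms
 *   ticks_to_msec(1, 1000) ->  1 ms  (same call, different hz)
 */
long
ticks_to_msec(long ticks, long hz)
{
	return (ticks * 1000L) / hz;
}
```

A hardcoded tick count therefore does not correspond to one fixed duration;
the millisecond value depends on the hz the kernel was built with.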

diff --git arch/alpha/alpha/promcons.c arch/alpha/alpha/promcons.c
index 9efabd3bf1c..b872f6e3931 100644
--- arch/alpha/alpha/promcons.c
+++ arch/alpha/alpha/promcons.c
@@ -100,7 +100,7 @@ promopen(dev, flag, mode, p)
error = (*linesw[tp->t_line].l_open)(dev, tp, p);
if (error == 0 && setuptimeout) {
timeout_set(_to, promtimeout, tp);
-   timeout_add(_to, 1);
+   timeout_add_msec(_to, 10);
}
return error;
 }
@@ -220,7 +220,7 @@ promtimeout(v)
if (tp->t_state & TS_ISOPEN)
(*linesw[tp->t_line].l_rint)(c, tp);
}
-   timeout_add(_to, 1);
+   timeout_add_msec(_to, 10);
 }
 
 struct tty *
diff --git arch/amd64/isa/clock.c arch/amd64/isa/clock.c
index db516d9ecde..9d5934e6817 100644
--- arch/amd64/isa/clock.c
+++ arch/amd64/isa/clock.c
@@ -326,7 +326,7 @@ rtcstart(void)
mc146818_write(NULL, MC_REGB, MC_REGB_24HR | MC_REGB_PIE);
 
/*
-* On a number of i386 systems, the rtc will fail to start when booting
+* On a number of amd64 systems, the rtc will fail to start when booting
 * the system. This is due to us missing to acknowledge an interrupt
 * during early stages of the boot process. If we do not acknowledge
 * the interrupt, the rtc clock will not generate further interrupts.
@@ -334,7 +334,7 @@ rtcstart(void)
 * to drain any un-acknowledged rtc interrupt(s).
 */
timeout_set(_timeout, rtcdrain, (void *)_timeout);
-   timeout_add(_timeout, 1);
+   timeout_add_msec(_timeout, 10);
 }
 
 void
diff --git arch/amd64/pci/pchb.c arch/amd64/pci/pchb.c
index 6e599d7be4a..80b5ada1cb4 100644
--- arch/amd64/pci/pchb.c
+++ arch/amd64/pci/pchb.c
@@ -332,7 +332,7 @@ pchb_rnd(void *v)
}
}
 
-   timeout_add(>sc_rng_to, 1);
+   timeout_add(>sc_rng_to, 10);
 }
 
 void
diff --git dev/ic/vga.c dev/ic/vga.c
index 74cc3e07bf8..2cebb65d2d5 100644
--- dev/ic/vga.c
+++ dev/ic/vga.c
@@ -765,7 +765,7 @@ vga_show_screen(void *v, void *cookie, int waitok, void 
(*cb)(void *, int, int),
if (cb) {
timeout_set(>vc_switch_timeout,
(void(*)(void *))vga_doswitch, vc);
-   timeout_add(>vc_switch_timeout, 0);
+   timeout_add(>vc_switch_timeout, 10);
return (EAGAIN);
}
 
diff --git net/if_pfsync.c net/if_pfsync.c
index 8d842e48466..dc8d2f41466 100644
--- net/if_pfsync.c
+++ net/if_pfsync.c
@@ -2318,7 +2318,7 @@ pfsync_bulk_start(void)
sc->sc_bulk_last = sc->sc_bulk_next;
 
pfsync_bulk_status(PFSYNC_BUS_START);
-   timeout_add(>sc_bulk_tmo, 0);
+   timeout_add_msec(>sc_bulk_tmo, 0);
}
 }
 





Re: More km_alloc(9)

2018-10-23 Thread Amit Kulkarni
> > Had the diff below in my tree for a very long time.  Switch several
> > uvm_km_alloc()/uvm_km_valloc() calls over to km_alloc().
> >
> > ok?
> >
>
> ok but there is a knf spacing issue (end-pa). other than that nit, ok mlarkin
>

tested to work fine on amd64.
thanks



Re: Remove VFSLCKDEBUG + ASSERT_VP_ISLOCKED (dead code in VFS)

2018-10-21 Thread Amit Kulkarni
> > After reading VOP_LOOKUP.9 based on recent commit, a try to remove some 
> > dead code in VFS.
> > https://marc.info/?l=openbsd-cvs&m=153886730207657&w=2
> > 
> > VFSLCKDEBUG is not defined anywhere. It is misleading to read in
> > sys/kern/vfs_vops.c that ASSERT_VP_ISLOCKED(dvp) is being checked, when in
> > fact, it is just dead code.
> 
> But you can build the kernel with -DVFSLCKDEBUG=1 to enable the debug code.


Aargh, you are right, silly me. I didn't grep it in GENERIC before sending this 
out. I compiled GENERIC with VFSLCKDEBUG on, and the kernel is running fine. So 
those checks are useful.

Forget about this diff!
Thanks
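
The point of this exchange is that a macro guarded by a build option is not
dead code just because the option is absent from the default configuration. A
minimal sketch of the pattern, with hypothetical names (LOCKDEBUG_SKETCH,
struct obj, ASSERT_OBJ_LOCKED, lock_check_failures) standing in for
VFSLCKDEBUG, struct vnode and ASSERT_VP_ISLOCKED; the real macro panics, while
this sketch only counts failures:

```c
#include <stdbool.h>

/* Hypothetical object with a lock flag; stands in for struct vnode. */
struct obj {
	bool locked;
};

/* Counts failed checks; the sketch does this instead of panicking. */
int lock_check_failures;

#ifdef LOCKDEBUG_SKETCH	/* hypothetical option: build with -DLOCKDEBUG_SKETCH */
#define ASSERT_OBJ_LOCKED(o) do {		\
	if (!(o)->locked)			\
		lock_check_failures++;		\
} while (0)
#else
#define ASSERT_OBJ_LOCKED(o)	/* nothing */
#endif

/* Callers always contain the check; the build option decides if it runs. */
int
use_obj(struct obj *o)
{
	ASSERT_OBJ_LOCKED(o);
	return o->locked ? 1 : 0;
}
```

Built without the option, the macro expands to nothing and use_obj() on an
unlocked object leaves the counter at zero; built with -DLOCKDEBUG_SKETCH, the
same call increments it.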



Remove VFSLCKDEBUG + ASSERT_VP_ISLOCKED (dead code in VFS)

2018-10-20 Thread Amit Kulkarni
Hi,

After reading VOP_LOOKUP.9 following a recent commit, here is an attempt to
remove some dead code in VFS.
https://marc.info/?l=openbsd-cvs&m=153886730207657&w=2

VFSLCKDEBUG is not defined anywhere. It is misleading to read in 
sys/kern/vfs_vops.c that ASSERT_VP_ISLOCKED(dvp) is being checked, when in 
fact, it is just dead code.

Please review and comment!
Thanks


diff --git sys/kern/vfs_subr.c sys/kern/vfs_subr.c
index b89037e499a..2d09c6446c8 100644
--- sys/kern/vfs_subr.c
+++ sys/kern/vfs_subr.c
@@ -1049,9 +1049,6 @@ vclean(struct vnode *vp, int flags, struct proc *p)
VN_KNOTE(vp, NOTE_REVOKE);
vp->v_tag = VT_NON;
vp->v_flag &= ~VXLOCK;
-#ifdef VFSLCKDEBUG
-   vp->v_flag &= ~VLOCKSWORK;
-#endif
if (vp->v_flag & VXWANT) {
vp->v_flag &= ~VXWANT;
wakeup(vp);
@@ -1886,11 +1883,6 @@ vinvalbuf(struct vnode *vp, int flags, struct ucred 
*cred, struct proc *p,
struct buf *nbp, *blist;
int s, error;
 
-#ifdef VFSLCKDEBUG
-   if ((vp->v_flag & VLOCKSWORK) && !VOP_ISLOCKED(vp))
-   panic("%s: vp isn't locked, vp %p", __func__, vp);
-#endif
-
if (flags & V_SAVE) {
s = splbio();
vwaitforio(vp, 0, "vinvalbuf", 0);
diff --git sys/kern/vfs_vops.c sys/kern/vfs_vops.c
index 32fcb4a24cc..c1996e1e4a8 100644
--- sys/kern/vfs_vops.c
+++ sys/kern/vfs_vops.c
@@ -47,19 +47,6 @@
 #include 
 #include 
 
-#ifdef VFSLCKDEBUG
-#include  /* for panic() */
-
-#define ASSERT_VP_ISLOCKED(vp) do {\
-   if (((vp)->v_flag & VLOCKSWORK) && !VOP_ISLOCKED(vp)) { \
-   VOP_PRINT(vp);  \
-   panic("vp not locked"); \
-   }   \
-} while (0)
-#else
-#define ASSERT_VP_ISLOCKED(vp)  /* nothing */
-#endif
-
 int
 VOP_ISLOCKED(struct vnode *vp)
 {
@@ -102,8 +89,6 @@ VOP_CREATE(struct vnode *dvp, struct vnode **vpp,
a.a_cnp = cnp;
a.a_vap = vap;
 
-   ASSERT_VP_ISLOCKED(dvp);
-
if (dvp->v_op->vop_create == NULL)
return (EOPNOTSUPP);
 
@@ -124,8 +109,6 @@ VOP_MKNOD(struct vnode *dvp, struct vnode **vpp,
a.a_cnp = cnp;
a.a_vap = vap;
 
-   ASSERT_VP_ISLOCKED(dvp);
-
if (dvp->v_op->vop_mknod == NULL)
return (EOPNOTSUPP);
 
@@ -164,8 +147,6 @@ VOP_CLOSE(struct vnode *vp, int fflag, struct ucred *cred, 
struct proc *p)
a.a_cred = cred;
a.a_p = p;
 
-   ASSERT_VP_ISLOCKED(vp);
-
if (vp->v_op->vop_close == NULL)
return (EOPNOTSUPP);
 
@@ -184,8 +165,6 @@ VOP_ACCESS(struct vnode *vp, int mode, struct ucred *cred, 
struct proc *p)
a.a_cred = cred;
a.a_p = p;
 
-   ASSERT_VP_ISLOCKED(vp);
-
if (vp->v_op->vop_access == NULL)
return (EOPNOTSUPP);
 
@@ -219,8 +198,6 @@ VOP_SETATTR(struct vnode *vp, struct vattr *vap, struct 
ucred *cred,
a.a_cred = cred;
a.a_p = p;
 
-   ASSERT_VP_ISLOCKED(vp);
-
if (vp->v_op->vop_setattr == NULL)
return (EOPNOTSUPP);
 
@@ -239,8 +216,6 @@ VOP_READ(struct vnode *vp, struct uio *uio, int ioflag, 
struct ucred *cred)
a.a_ioflag = ioflag;
a.a_cred = cred;
 
-   ASSERT_VP_ISLOCKED(vp);
-
if (vp->v_op->vop_read == NULL)
return (EOPNOTSUPP);
 
@@ -258,8 +233,6 @@ VOP_WRITE(struct vnode *vp, struct uio *uio, int ioflag,
a.a_ioflag = ioflag;
a.a_cred = cred;
 
-   ASSERT_VP_ISLOCKED(vp);
-
if (vp->v_op->vop_write == NULL)
return (EOPNOTSUPP);
 
@@ -343,8 +316,6 @@ VOP_FSYNC(struct vnode *vp, struct ucred *cred, int waitfor,
a.a_waitfor = waitfor;
a.a_p = p;
 
-   ASSERT_VP_ISLOCKED(vp);
-
if (vp->v_op->vop_fsync == NULL)
return (EOPNOTSUPP);
 
@@ -363,9 +334,6 @@ VOP_REMOVE(struct vnode *dvp, struct vnode *vp, struct 
componentname *cnp)
 a.a_vp = vp;
a.a_cnp = cnp;
 
-   ASSERT_VP_ISLOCKED(dvp);
-   ASSERT_VP_ISLOCKED(vp);
-
if (dvp->v_op->vop_remove == NULL)
return (EOPNOTSUPP);
 
@@ -384,8 +352,6 @@ VOP_LINK(struct vnode *dvp, struct vnode *vp, struct 
componentname *cnp)
a.a_vp = vp;
a.a_cnp = cnp;
 
-   ASSERT_VP_ISLOCKED(dvp);
-
if (dvp->v_op->vop_link == NULL)
return (EOPNOTSUPP);
 
@@ -411,8 +377,6 @@ VOP_RENAME(struct vnode *fdvp, struct vnode *fvp,
a.a_tvp = tvp;
a.a_tcnp = tcnp;
 
-   ASSERT_VP_ISLOCKED(tdvp);
-
if (fdvp->v_op->vop_rename == NULL) 
return (EOPNOTSUPP);
 
@@ -435,8 +399,6 @@ VOP_MKDIR(struct vnode *dvp, struct vnode **vpp,
a.a_cnp = cnp;
a.a_vap = vap;
 
-   ASSERT_VP_ISLOCKED(dvp);
-
if (dvp->v_op->vop_mkdir == NULL)
return (EOPNOTSUPP);
 
@@ -455,9 +417,6 @@ VOP_RMDIR(struct vnode *dvp, 

Remove UVM_PAGE_OWN() and remove define UVM_PAGE_TRKOWN

2018-10-16 Thread Amit Kulkarni
Hi all,

Justification for removing this code:

1) This is currently dead code: UVM_PAGE_TRKOWN is not defined in GENERIC on
any arch. While reading the source, it gives the incorrect impression that
this code is useful.

2) The comment for uvm_page_own() mentions tracking down problems with the
PG_BUSY flag, but the usage is incorrect in this instance. At line 1061 of
uvm_aobj.c, atomic_setbits_int() is followed by UVM_PAGE_OWN(ptmp, NULL).
When unlocking, there should be a preceding call to
atomic_clearbits_int(struct, PG_BUSY), which is the usual usage in other
instances. This threw me off, and led me to figure out that this is dead code.

Please review and comment.
Thanks in advance for your time!


diff --git sys/uvm/uvm.h sys/uvm/uvm.h
index 3e765a66226..29a31892c29 100644
--- sys/uvm/uvm.h
+++ sys/uvm/uvm.h
@@ -115,16 +115,6 @@ do {   
\
tsleep(event, PVM|(intr ? PCATCH : 0), msg, timo);  \
 } while (0)
 
-/*
- * UVM_PAGE_OWN: track page ownership (only if UVM_PAGE_TRKOWN)
- */
-
-#if defined(UVM_PAGE_TRKOWN)
-#define UVM_PAGE_OWN(PG, TAG) uvm_page_own(PG, TAG)
-#else
-#define UVM_PAGE_OWN(PG, TAG) /* nothing */
-#endif /* UVM_PAGE_TRKOWN */
-
 /*
  * uvm_map internal functions.
  * Used by uvm_map address selectors.
diff --git sys/uvm/uvm_amap.c sys/uvm/uvm_amap.c
index 5cf15f24317..63d67c4b770 100644
--- sys/uvm/uvm_amap.c
+++ sys/uvm/uvm_amap.c
@@ -719,7 +719,6 @@ ReStart:
 * PG_RELEASED | PG_WANTED.
 */
atomic_clearbits_int(>pg_flags, PG_BUSY|PG_FAKE);
-   UVM_PAGE_OWN(npg, NULL);
uvm_lock_pageq();
uvm_pageactivate(npg);
uvm_unlock_pageq();
diff --git sys/uvm/uvm_aobj.c sys/uvm/uvm_aobj.c
index 63e6c993fc2..73d7e16b041 100644
--- sys/uvm/uvm_aobj.c
+++ sys/uvm/uvm_aobj.c
@@ -1061,7 +1061,6 @@ uao_get(struct uvm_object *uobj, voff_t offset, struct 
vm_page **pps,
PG_BUSY|PG_FAKE);
atomic_setbits_int(>pg_flags,
PQ_AOBJ);
-   UVM_PAGE_OWN(ptmp, NULL);
}
}
 
@@ -1081,7 +1080,6 @@ uao_get(struct uvm_object *uobj, voff_t offset, struct 
vm_page **pps,
 */
/* caller must un-busy this page */
atomic_setbits_int(>pg_flags, PG_BUSY);
-   UVM_PAGE_OWN(ptmp, "uao_get1");
pps[lcv] = ptmp;
gotpages++;
 
@@ -1178,7 +1176,6 @@ uao_get(struct uvm_object *uobj, voff_t offset, struct 
vm_page **pps,
 */
/* we own it, caller must un-busy */
atomic_setbits_int(>pg_flags, PG_BUSY);
-   UVM_PAGE_OWN(ptmp, "uao_get2");
pps[lcv] = ptmp;
}
 
@@ -1219,7 +1216,6 @@ uao_get(struct uvm_object *uobj, voff_t offset, struct 
vm_page **pps,
wakeup(ptmp);
atomic_clearbits_int(>pg_flags,
PG_WANTED|PG_BUSY);
-   UVM_PAGE_OWN(ptmp, NULL);
uvm_lock_pageq();
uvm_pagefree(ptmp);
uvm_unlock_pageq();
@@ -1436,7 +1432,6 @@ uao_pagein_page(struct uvm_aobj *aobj, int pageidx)
slot = uao_set_swslot(>u_obj, pageidx, 0);
uvm_swap_free(slot, 1);
atomic_clearbits_int(>pg_flags, PG_BUSY|PG_CLEAN|PG_FAKE);
-   UVM_PAGE_OWN(pg, NULL);
 
/* deactivate the page (to put it on a page queue). */
pmap_clear_reference(pg);
diff --git sys/uvm/uvm_fault.c sys/uvm/uvm_fault.c
index 635283fac7b..4302bed85a8 100644
--- sys/uvm/uvm_fault.c
+++ sys/uvm/uvm_fault.c
@@ -356,7 +356,6 @@ uvmfault_anonget(struct uvm_faultinfo *ufi, struct vm_amap 
*amap,
/* un-busy! */
atomic_clearbits_int(>pg_flags,
PG_WANTED|PG_BUSY|PG_FAKE);
-   UVM_PAGE_OWN(pg, NULL);
 
/* 
 * if we were RELEASED during I/O, then our anon is
@@ -801,7 +800,6 @@ ReFault:
 */
atomic_clearbits_int([lcv]->pg_flags,
PG_BUSY);
-   UVM_PAGE_OWN(pages[lcv], NULL);
}   /* for "lcv" loop */
pmap_update(ufi.orig_map->pmap);
}   /* "gotpages" != 0 */
@@ -908,7 +906,6 @@ ReFault:
uvm_pagecopy(oanon->an_page, pg);   /* pg now !PG_CLEAN */

Re: witness report

2018-06-03 Thread Amit Kulkarni
On Sun, 3 Jun 2018 13:09:43 -0700
Philip Guenther  wrote:

> On Sun, Jun 3, 2018 at 12:51 PM, Amit Kulkarni  wrote:
> 
> > On Sun, 3 Jun 2018 10:37:30 -0700
> > Philip Guenther  wrote:
> >
> ...
> 
> > > Index: kern/kern_rwlock.c
> > > ===
> > > RCS file: /data/src/openbsd/src/sys/kern/kern_rwlock.c,v
> > > retrieving revision 1.35
> > > diff -u -p -r1.35 kern_rwlock.c
> > > --- kern/kern_rwlock.c21 Mar 2018 12:28:39 -  1.35
> > > +++ kern/kern_rwlock.c3 Jun 2018 17:00:02 -
> > > @@ -223,6 +223,8 @@ _rw_enter(struct rwlock *rwl, int flags
> > >   lop_flags = LOP_NEWORDER;
> > >   if (flags & RW_WRITE)
> > >   lop_flags |= LOP_EXCLUSIVE;
> > > + if (flags & RW_DUPOK)
> > > + lop_flags |= LOP_DUPOK;
> > >   if ((flags & RW_NOSLEEP) == 0 && (flags & RW_DOWNGRADE) == 0)
> > >   WITNESS_CHECKORDER(>rwl_lock_obj, lop_flags, file,
> > line,
> > >   NULL);
> > > Index: kern/vfs_subr.c
> > > ===
> > > RCS file: /data/src/openbsd/src/sys/kern/vfs_subr.c,v
> > > retrieving revision 1.273
> > > diff -u -p -r1.273 vfs_subr.c
> > > --- kern/vfs_subr.c   27 May 2018 06:02:14 -  1.273
> > > +++ kern/vfs_subr.c   3 Jun 2018 17:04:09 -
> > > @@ -188,6 +188,11 @@ vfs_busy(struct mount *mp, int flags)
> > >   else
> > >   rwflags |= RW_NOSLEEP;
> > >
> > > +#ifdef WITNESS
> > > + if (flags & VB_DUPOK)
> > > + rwflags |= RW_DUPOK;
> > > +#endif
> > > +
> >
> > The other places where you added DUPOK are not guarded by #ifdef WITNESS.
> > Shouldn't this part above apply to all kernels, WITNESS or non-WITNESS?
> >
> 
> No, the other code-generating additions, in kern_rwlock.c, are also inside
> #ifdef WITNESS, just outside of the context of the diff.  The RW_DUPOK flag
> has no effect if it's not a WITNESS kernel so excluding those lines is
> intentional.

My apologies, and sorry for the noise!
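
To make the point about the #ifdef concrete, here is a minimal userland sketch. It assumes only the two flag values shown in the diff; vb_to_rw_flags() and vb_to_rw_flags_check() are made-up names for this example, not kernel functions, and the sketch ignores every other flag that vfs_busy() translates. It shows that on a non-WITNESS build VB_DUPOK is accepted but never becomes RW_DUPOK.

```c
#include <assert.h>

#define VB_DUPOK 0x10      /* value from the sys/mount.h hunk */
#define RW_DUPOK 0x0100UL  /* value from the sys/rwlock.h hunk */

/*
 * Hypothetical helper: translate the DUPOK bit of vfs_busy() flags into
 * rw_enter() flags, the way the vfs_subr.c hunk does.
 */
static unsigned long
vb_to_rw_flags(int vbflags)
{
	unsigned long rwflags = 0;

#ifdef WITNESS
	/* Only a WITNESS kernel looks at RW_DUPOK, so only it sets the bit. */
	if (vbflags & VB_DUPOK)
		rwflags |= RW_DUPOK;
#else
	(void)vbflags;	/* the flag is accepted but has no effect */
#endif
	return rwflags;
}

/* Usage example: the result depends on whether WITNESS is defined. */
static void
vb_to_rw_flags_check(void)
{
	assert(vb_to_rw_flags(0) == 0);
#ifdef WITNESS
	assert(vb_to_rw_flags(VB_DUPOK) == RW_DUPOK);
#else
	assert(vb_to_rw_flags(VB_DUPOK) == 0);
#endif
}
```

Building the sketch with -DWITNESS flips the second branch, which mirrors why the lines can be left out of a non-WITNESS kernel without changing behaviour.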



Re: witness report

2018-06-03 Thread Amit Kulkarni
On Sun, 3 Jun 2018 10:37:30 -0700
Philip Guenther  wrote:

> On Sun, 3 Jun 2018, Theo de Raadt wrote:
> > Philip Guenther  wrote:
> > > The warning is not that a single filesystem is being locked 
> > > recursively by a single thread, but just that a single thread is 
> > > holding locks on multiple filesystems.
> > 
> > vfs_stall() needs to grab locks on all filesystems, to stop a variety of 
> > filesystem transactions.  (Other types of transactions are blocked in 
> > other ways).
> > 
> > sys_umount() grabs locks on all filesystems above it, to stop anyone 
> > from doing a parallel mount/unmount along the same path.
> > 
> > This is all normal.
> 
> Diff below adds VB_DUPOK to indicate that a given vfs_busy() call is 
> expected/permitted to occur while the thread already holds another 
> filesystem busy; the caller is responsible for ensuring the filesystems 
> are locked in the correct order (or that NOWAIT is used safely as in the 
> sys_mount() case).
> 
> As part of this, this plumbs RW_DUPOK for rw_enter().
> 
> I no longer get the witness warning on hibernate with this.
> 
> ok?
> 
> Philip Guenther
> 
> 
> Index: sys/mount.h
> ===
> RCS file: /data/src/openbsd/src/sys/sys/mount.h,v
> retrieving revision 1.136
> diff -u -p -r1.136 mount.h
> --- sys/mount.h   8 May 2018 08:58:49 -   1.136
> +++ sys/mount.h   3 Jun 2018 17:05:28 -
> @@ -564,6 +564,7 @@ int   vfs_busy(struct mount *, int);
>  #define VB_WRITE 0x02
>  #define VB_NOWAIT 0x04 /* immediately fail on busy lock */
>  #define VB_WAIT  0x08 /* sleep fail on busy lock */
> +#define VB_DUPOK 0x10 /* permit duplicate mount busying */
>  
>  int vfs_isbusy(struct mount *);
>  int vfs_mount_foreach_vnode(struct mount *, int (*func)(struct vnode *,
> Index: sys/rwlock.h
> ===
> RCS file: /data/src/openbsd/src/sys/sys/rwlock.h,v
> retrieving revision 1.22
> diff -u -p -r1.22 rwlock.h
> --- sys/rwlock.h  12 Aug 2017 23:27:44 -  1.22
> +++ sys/rwlock.h  3 Jun 2018 17:05:32 -
> @@ -116,6 +116,7 @@ struct rwlock {
>  #define RW_SLEEPFAIL 0x0020UL /* fail if we slept for the lock */
>  #define RW_NOSLEEP   0x0040UL /* don't wait for the lock */
>  #define RW_RECURSEFAIL   0x0080UL /* Fail on recursion for RRW locks. */
> +#define RW_DUPOK 0x0100UL /* Permit duplicate lock */
>  
>  /*
>   * for rw_status() and rrw_status() only: exclusive lock held by
> Index: kern/kern_rwlock.c
> ===
> RCS file: /data/src/openbsd/src/sys/kern/kern_rwlock.c,v
> retrieving revision 1.35
> diff -u -p -r1.35 kern_rwlock.c
> --- kern/kern_rwlock.c21 Mar 2018 12:28:39 -  1.35
> +++ kern/kern_rwlock.c3 Jun 2018 17:00:02 -
> @@ -223,6 +223,8 @@ _rw_enter(struct rwlock *rwl, int flags 
>   lop_flags = LOP_NEWORDER;
>   if (flags & RW_WRITE)
>   lop_flags |= LOP_EXCLUSIVE;
> + if (flags & RW_DUPOK)
> + lop_flags |= LOP_DUPOK;
>   if ((flags & RW_NOSLEEP) == 0 && (flags & RW_DOWNGRADE) == 0)
>   WITNESS_CHECKORDER(&rwl->rwl_lock_obj, lop_flags, file, line,
>   NULL);
> Index: kern/vfs_subr.c
> ===
> RCS file: /data/src/openbsd/src/sys/kern/vfs_subr.c,v
> retrieving revision 1.273
> diff -u -p -r1.273 vfs_subr.c
> --- kern/vfs_subr.c   27 May 2018 06:02:14 -  1.273
> +++ kern/vfs_subr.c   3 Jun 2018 17:04:09 -
> @@ -188,6 +188,11 @@ vfs_busy(struct mount *mp, int flags)
>   else
>   rwflags |= RW_NOSLEEP;
>  
> +#ifdef WITNESS
> + if (flags & VB_DUPOK)
> + rwflags |= RW_DUPOK;
> +#endif
> +

The other places where you added DUPOK are not guarded by #ifdef WITNESS.
Shouldn't this part above apply to all kernels, WITNESS or non-WITNESS?



Re: de-hole some structs on amd64

2018-05-31 Thread Amit Kulkarni
Hi,

Is there any feedback on this? 

Thanks

> > > I tested removing some slop (i.e. structure packing/de-holing) on amd64,
> > > this went through a full kernel + userland build.
> > >
> > 
> > Parts of this are probably okay, but there's some stuff which needs better
> > placement vs comments and at least one move which needs a justification for
> > it being safe (or not).
> 
> Thanks for your feedback!
> 
> > > --- a/sys/sys/proc.h
> > > +++ b/sys/sys/proc.h
> > > @@ -170,8 +170,8 @@ struct process {
> > >
> > >  /* The following fields are all zeroed upon creation in process_new. */
> > >  #define ps_startzero ps_klist
> > > -   struct  klist ps_klist; /* knotes attached to this process */
> > > int ps_flags;   /* PS_* flags. */
> > > +   struct  klist ps_klist; /* knotes attached to this process */
> > >
> > 
> > Nope: you've moved ps_flags from inside the "zeroed out on fork" region to
> > outside of it
> > a) without justifying why that's safe, and
> > b) while leaving it below the comment saying that it's zeroed, when it no
> > longer is.
> 
> My fault, I didn't read the defines properly before sending. Fixed by
> defining ps_startzero to point to ps_flags, so it is zeroed out as before.
> 
> > 
> > Do any of the other moves here cross a start/end zero/copy marker?
> 
> Thanks for the hint. I re-checked against the process_new() and thread_new()
> functions in kern_fork.c. All the moves stay within the
> startcopy/startzero and endcopy/endzero defines in both struct proc and
> struct process, so the memset to 0 and the memcpy from the parent will work as
> before. I updated a comment to point to the thread_new() function, so it is clear
> where struct proc is initialized. Please let me know if I have overlooked anything.
> 
> > 
> > > @@ -285,6 +284,7 @@ struct proc {
> > > struct  futex   *p_futex;   /* Current sleeping futex. */
> > >
> > > /* substructures: */
> > > +   LIST_ENTRY(proc) p_hash;/* Hash chain. */
> > > struct  filedesc *p_fd; /* copy of p_p->ps_fd */
> > > struct  vmspace *p_vmspace; /* copy of p_p->ps_vmspace */
> > >
> > 
> > p_hash isn't a substructure, so putting it below the /* substructures: */
> > comment is wrong.  Please pay attention to the comments and consider how
> > > they apply (or don't) to the members you're moving.
> 
> Fixed.
> 
> > 
> > > @@ -305,6 +304,11 @@ struct proc {
> > > > long    p_thrslpid; /* for thrsleep syscall */
> > >
> > > /* scheduling */
> > > +   struct  cpu_info * volatile p_cpu; /* CPU we're running on. */
> > > +
> > > +   struct  rusage p_ru;/* Statistics */
> > > +   struct  tusage p_tu;/* accumulated times. */
> > > +   struct  timespec p_rtime;   /* Real time. */
> > > u_int   p_estcpu;/* Time averaged value of p_cpticks. */
> > > int p_cpticks;   /* Ticks of cpu time. */
> > >
> > 
> > > Again, you've separated the scheduling parameters from the /* scheduling */
> > > comment, putting members that aren't about scheduling between them.
> 
> Fixed. The structs rusage/tusage/timespec are not part of scheduling, so I 
> moved them before the scheduling comment.
> 
> Updated diff follows. This survived a kernel compile, reboot, and use for 
> quite some time.
> 
> 
> diff --git a/sys/sys/proc.h b/sys/sys/proc.h
> index 1c7ea4697e2..d6082cb0551 100644
> --- a/sys/sys/proc.h
> +++ b/sys/sys/proc.h
> @@ -169,9 +169,9 @@ struct process {
>   pid_t   ps_pid; /* Process identifier. */
>  
>  /* The following fields are all zeroed upon creation in process_new. */
> -#define  ps_startzero ps_klist
> - struct  klist ps_klist; /* knotes attached to this process */
> +#define  ps_startzero ps_flags
>   int ps_flags;   /* PS_* flags. */
> + struct  klist ps_klist; /* knotes attached to this process */
>  
>   struct  proc *ps_single;/* Single threading to this thread. */
>   int ps_singlecount; /* Not yet suspended threads. */
> @@ -200,15 +200,6 @@ struct process {
>   struct  pgrp *ps_pgrp;  /* Pointer to process group. */
>   struct  emul *ps_emul;  /* Emulation information */
>  
> - char    ps_comm[MAXCOMLEN+1];
> -
> - vaddr_t ps_strings; /* User pointers to argv/env */
> - vaddr_t ps_sigcode; /* User pointer to the signal code */
> - vaddr_t ps_sigcoderet;  /* User pointer to sigreturn retPC */
> - u_long  ps_sigcookie;
> - u_int   ps_rtableid;/* Process routing table/domain. */
> - char    ps_nice;/* Process "nice" value. */
> -
>   struct uprof {  /* profile arguments */
>   caddr_t pr_base;/* buffer base */
>   size_t  pr_size;/* buffer size */
> @@ 

Re: de-hole some structs on amd64

2018-05-22 Thread Amit Kulkarni
> > I tested removing some slop (i.e. structure packing/de-holing) on amd64,
> > this went through a full kernel + userland build.
> >
> 
> Parts of this are probably okay, but there's some stuff which needs better
> placement vs comments and at least one move which needs a justification for
> it being safe (or not).

Thanks for your feedback!

> > --- a/sys/sys/proc.h
> > +++ b/sys/sys/proc.h
> > @@ -170,8 +170,8 @@ struct process {
> >
> >  /* The following fields are all zeroed upon creation in process_new. */
> >  #define ps_startzero ps_klist
> > -   struct  klist ps_klist; /* knotes attached to this process */
> > int ps_flags;   /* PS_* flags. */
> > +   struct  klist ps_klist; /* knotes attached to this process */
> >
> 
> Nope: you've moved ps_flags from inside the "zeroed out on fork" region to
> outside of it
> a) without justifying why that's safe, and
> b) while leaving it below the comment saying that it's zeroed, when it no
> longer is.

My fault, I didn't read the defines properly before sending. Fixed by defining
ps_startzero to point to ps_flags, so it is zeroed out as before.

> 
> Do any of the other moves here cross a start/end zero/copy marker?

Thanks for the hint. I re-checked against the process_new() and thread_new()
functions in kern_fork.c. All the moves stay within the
startcopy/startzero and endcopy/endzero defines in both struct proc and struct
process, so the memset to 0 and the memcpy from the parent will work as before. I
updated a comment to point to the thread_new() function, so it is clear where
struct proc is initialized. Please let me know if I have overlooked anything.
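
For anyone not familiar with the marker convention, here is a minimal sketch of the idea. struct toy_process, its tp_* members and toy_process_init() are hypothetical and only mirror the pattern; the real code is in process_new() and thread_new() and may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of the layout convention, not the real struct. */
struct toy_process {
	int	tp_pid;		/* set explicitly by the creator */

	/* Members from tp_startzero up to tp_endzero are zeroed. */
#define	tp_startzero	tp_flags
	int	tp_flags;
	void	*tp_klist;
	int	tp_singlecount;
#define	tp_endzero	tp_startcopy

	/* Members from tp_startcopy up to tp_endcopy are copied. */
#define	tp_startcopy	tp_nice
	char	tp_nice;
	unsigned int tp_rtableid;
#define	tp_endcopy	tp_tail
	int	tp_tail;	/* first member past the copied region */
};

static void
toy_process_init(struct toy_process *child, const struct toy_process *parent)
{
	const size_t zstart = offsetof(struct toy_process, tp_startzero);
	const size_t zend = offsetof(struct toy_process, tp_endzero);
	const size_t cstart = offsetof(struct toy_process, tp_startcopy);
	const size_t cend = offsetof(struct toy_process, tp_endcopy);

	/* The markers must be in layout order or the lengths are garbage. */
	assert(zstart <= zend && cstart <= cend);

	/* Zero everything between the zero markers. */
	memset((char *)child + zstart, 0, zend - zstart);
	/* Copy everything between the copy markers from the parent. */
	memcpy((char *)child + cstart, (const char *)parent + cstart,
	    cend - cstart);
}
```

A member that is moved in front of the tp_startzero marker silently drops out of the memset, which is the mistake in my first diff: ps_flags had moved ahead of ps_klist while ps_startzero still named ps_klist.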

> 
> > @@ -285,6 +284,7 @@ struct proc {
> > struct  futex   *p_futex;   /* Current sleeping futex. */
> >
> > /* substructures: */
> > +   LIST_ENTRY(proc) p_hash;/* Hash chain. */
> > struct  filedesc *p_fd; /* copy of p_p->ps_fd */
> > struct  vmspace *p_vmspace; /* copy of p_p->ps_vmspace */
> >
> 
> p_hash isn't a substructure, so putting it below the /* substructures: */
> comment is wrong.  Please pay attention to the comments and consider how
> they apply (or don't) to the members you're moving.

Fixed.

> 
> > @@ -305,6 +304,11 @@ struct proc {
> > > long    p_thrslpid; /* for thrsleep syscall */
> >
> > /* scheduling */
> > +   struct  cpu_info * volatile p_cpu; /* CPU we're running on. */
> > +
> > +   struct  rusage p_ru;/* Statistics */
> > +   struct  tusage p_tu;/* accumulated times. */
> > +   struct  timespec p_rtime;   /* Real time. */
> > u_int   p_estcpu;/* Time averaged value of p_cpticks. */
> > int p_cpticks;   /* Ticks of cpu time. */
> >
> 
> Again, you've separated the scheduling parameters from the /* scheduling */
> comment, putting members that aren't about scheduling between them.

Fixed. The structs rusage/tusage/timespec are not part of scheduling, so I 
moved them before the scheduling comment.

Updated diff follows. This survived a kernel compile, reboot, and use for quite 
some time.


diff --git a/sys/sys/proc.h b/sys/sys/proc.h
index 1c7ea4697e2..d6082cb0551 100644
--- a/sys/sys/proc.h
+++ b/sys/sys/proc.h
@@ -169,9 +169,9 @@ struct process {
pid_t   ps_pid; /* Process identifier. */
 
 /* The following fields are all zeroed upon creation in process_new. */
-#define ps_startzero ps_klist
-   struct  klist ps_klist; /* knotes attached to this process */
+#define ps_startzero ps_flags
int ps_flags;   /* PS_* flags. */
+   struct  klist ps_klist; /* knotes attached to this process */
 
struct  proc *ps_single;/* Single threading to this thread. */
int ps_singlecount; /* Not yet suspended threads. */
@@ -200,15 +200,6 @@ struct process {
struct  pgrp *ps_pgrp;  /* Pointer to process group. */
struct  emul *ps_emul;  /* Emulation information */
 
-   char    ps_comm[MAXCOMLEN+1];
-
-   vaddr_t ps_strings; /* User pointers to argv/env */
-   vaddr_t ps_sigcode; /* User pointer to the signal code */
-   vaddr_t ps_sigcoderet;  /* User pointer to sigreturn retPC */
-   u_long  ps_sigcookie;
-   u_int   ps_rtableid;/* Process routing table/domain. */
-   char    ps_nice;/* Process "nice" value. */
-
struct uprof {  /* profile arguments */
caddr_t pr_base;/* buffer base */
size_t  pr_size;/* buffer size */
@@ -216,7 +207,15 @@ struct process {
u_int   pr_scale;   /* pc scaling */
} ps_prof;
 
+   char    ps_comm[MAXCOMLEN+1];
+   char    ps_nice;/* Process "nice" value. */
u_short 

de-hole some structs on amd64

2018-05-19 Thread Amit Kulkarni
Hi,

I tested removing some slop (i.e. structure packing/de-holing) on amd64. This
went through a full kernel + userland build.

struct proc 20 bytes (6 places) --> 4 bytes (2 places)
struct process 28 bytes (6 places) --> 4 bytes (1 place)
struct vm_map 8 bytes (2 places) --> 0 bytes
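
As a minimal sketch of where such holes come from, assuming a typical LP64 ABI such as amd64: the two structs below are hypothetical and have nothing to do with the real members of struct proc or struct process. They only show that reordering members by alignment removes interior padding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout with holes. */
struct holey {
	char	a;	/* followed by padding so that b is aligned */
	long	b;
	char	c;	/* followed by tail padding */
};

/* Same members, largest alignment first. */
struct reordered {
	long	b;
	char	a;
	char	c;	/* a and c now sit next to each other */
};

/* Padding = total size minus the sum of the member sizes. */
static size_t
holey_padding(void)
{
	return sizeof(struct holey) - (2 * sizeof(char) + sizeof(long));
}

static size_t
reordered_padding(void)
{
	return sizeof(struct reordered) - (2 * sizeof(char) + sizeof(long));
}

/* Usage example: reordering should never add padding here. */
static void
layout_check(void)
{
	assert(offsetof(struct reordered, b) == 0);
	assert(reordered_padding() <= holey_padding());
}
```

On amd64 this should come out as 24 bytes for struct holey and 16 bytes for struct reordered, i.e. 14 bytes of padding reduced to 6; other ABIs may give different numbers.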

Thanks



diff --git a/sys/sys/proc.h b/sys/sys/proc.h
index 1c7ea4697e2..d2955e2d0f7 100644
--- a/sys/sys/proc.h
+++ b/sys/sys/proc.h
@@ -170,8 +170,8 @@ struct process {
 
 /* The following fields are all zeroed upon creation in process_new. */
 #define ps_startzero ps_klist
-   struct  klist ps_klist; /* knotes attached to this process */
int ps_flags;   /* PS_* flags. */
+   struct  klist ps_klist; /* knotes attached to this process */
 
struct  proc *ps_single;/* Single threading to this thread. */
int ps_singlecount; /* Not yet suspended threads. */
@@ -200,15 +200,6 @@ struct process {
struct  pgrp *ps_pgrp;  /* Pointer to process group. */
struct  emul *ps_emul;  /* Emulation information */
 
-   char    ps_comm[MAXCOMLEN+1];
-
-   vaddr_t ps_strings; /* User pointers to argv/env */
-   vaddr_t ps_sigcode; /* User pointer to the signal code */
-   vaddr_t ps_sigcoderet;  /* User pointer to sigreturn retPC */
-   u_long  ps_sigcookie;
-   u_int   ps_rtableid;/* Process routing table/domain. */
-   char    ps_nice;/* Process "nice" value. */
-
struct uprof {  /* profile arguments */
caddr_t pr_base;/* buffer base */
size_t  pr_size;/* buffer size */
@@ -216,7 +207,15 @@ struct process {
u_int   pr_scale;   /* pc scaling */
} ps_prof;
 
+   char    ps_comm[MAXCOMLEN+1];
+   char    ps_nice;/* Process "nice" value. */
u_short ps_acflag;  /* Accounting flags. */
+   u_int   ps_rtableid;/* Process routing table/domain. */
+
+   vaddr_t ps_strings; /* User pointers to argv/env */
+   vaddr_t ps_sigcode; /* User pointer to the signal code */
+   vaddr_t ps_sigcoderet;  /* User pointer to sigreturn retPC */
+   u_long  ps_sigcookie;
 
uint64_t ps_pledge;
uint64_t ps_execpledge;
@@ -285,6 +284,7 @@ struct proc {
struct  futex   *p_futex;   /* Current sleeping futex. */
 
/* substructures: */
+   LIST_ENTRY(proc) p_hash;/* Hash chain. */
struct  filedesc *p_fd; /* copy of p_p->ps_fd */
struct  vmspace *p_vmspace; /* copy of p_p->ps_vmspace */
 #define p_rlimit p_p->ps_limit->pl_rlimit
@@ -296,7 +296,6 @@ struct proc {
u_char  p_descfd;   /* if not 255, fdesc permits this fd */
 
pid_t   p_tid;  /* Thread identifier. */
-   LIST_ENTRY(proc) p_hash;/* Hash chain. */
 
 /* The following fields are all zeroed upon creation in fork. */
 #define p_startzero p_dupfd
@@ -305,6 +304,11 @@ struct proc {
long    p_thrslpid; /* for thrsleep syscall */
 
/* scheduling */
+   struct  cpu_info * volatile p_cpu; /* CPU we're running on. */
+
+   struct  rusage p_ru;/* Statistics */
+   struct  tusage p_tu;/* accumulated times. */
+   struct  timespec p_rtime;   /* Real time. */
u_int   p_estcpu;/* Time averaged value of p_cpticks. */
int p_cpticks;   /* Ticks of cpu time. */
const volatile void *p_wchan;/* Sleep address. */
@@ -315,11 +319,6 @@ struct proc {
u_int   p_uticks;   /* Statclock hits in user mode. */
u_int   p_sticks;   /* Statclock hits in system mode. */
u_int   p_iticks;   /* Statclock hits processing intr. */
-   struct  cpu_info * volatile p_cpu; /* CPU we're running on. */
-
-   struct  rusage p_ru;/* Statistics */
-   struct  tusage p_tu;/* accumulated times. */
-   struct  timespec p_rtime;   /* Real time. */
 
int  p_siglist; /* Signals arrived but not delivered. */
 
diff --git a/sys/uvm/uvm_map.h b/sys/uvm/uvm_map.h
index 07ca0d0ef45..4a63463d325 100644
--- a/sys/uvm/uvm_map.h
+++ b/sys/uvm/uvm_map.h
@@ -292,16 +292,15 @@ struct vm_map {
struct pmap *   pmap;   /* Physical map */
struct rwlock   lock;   /* Lock for map data */
struct mutex    mtx;
-   u_int   serial; /* signals stack changes */
-
struct uvm_map_addr addr;   /* Entry tree, by addr */
+   u_int   serial; /* signals stack changes */
 
-   vsize_t size;   /* virtual size */
int

Re: elf.h

2017-07-14 Thread Amit Kulkarni
> So with this, base compiles and runs on amd64. W.r.t. ports, yes,
> some of those fail. I have not looked closely at the strange failures,
> which may be caused by my not cleaning up the filesystem after previous
> snapshots: this machine has accumulated two years of snapshots without any
> file removal (ports failing: thunderbird, seamonkey, tor-browser, poco, pypy).

pkg_add sysclean

You might have to add some files/directories to sysclean.ignore, but
generally you can get your system to be pristine by following its
recommendations.

Thanks



Re: Better handling of short reads

2017-06-14 Thread Amit Kulkarni
On Wed, Jun 14, 2017 at 4:56 AM, Mike Belopuhov <m...@belopuhov.com> wrote:
> On Thu, Jun 08, 2017 at 11:55 +0200, Mike Belopuhov wrote:
>> On Wed, Jun 07, 2017 at 23:04 -0500, Amit Kulkarni wrote:
>> > On Wed, 7 Jun 2017 21:27:27 -0500
>> > Amit Kulkarni <amit.o...@gmail.com> wrote:
>> >
>> > > On Thu, 8 Jun 2017 01:57:25 +0200
>> > > Mike Belopuhov <m...@belopuhov.com> wrote:
>> > >
>> > > > On Wed, Jun 07, 2017 at 18:35 -0500, Amit Kulkarni wrote:
>> > > > > Wow, please get this in!!!
>> > > > >
>> > > > > This fixes cvs update on hard disks, to go much much faster. When I 
>> > > > > am
>> > > > > updating the entire set of cvs trees: www, src, xenocara, ports, I 
>> > > > > can
>> > > > > still use firefox and have it perfectly usable. There's a night and
>> > > > > day improvement, before and after. Thanks for debugging and fixing
>> > > > > this.
>> > > > >
>> > > >
>> > > > What kind of broken hardware do you have that this diff helps you?
>> > > > Can you show us your dmesg?
>> > > >
>> >
>> > Please ignore previous dmesg, it was incomplete.
>> >
>>
>> Are you 100% sure this diff changes anything for you?
>> Can you please try the one below.  It adds a printf.
>>
>
> As you all might have gathered by now Amit has jumped the gun
> but was wrong to do so.  His setup is not affected by this change.
> That was expected so please don't get distracted by this as I'm
> still looking forward to replies to the original set of changes.
> beck@?
>

Yes, my fault.

But some change seriously improved cvs update around the time
Mike posted this diff.

Thanks



Re: Better handling of short reads

2017-06-07 Thread Amit Kulkarni
On Wed, 7 Jun 2017 21:27:27 -0500
Amit Kulkarni <amit.o...@gmail.com> wrote:

> On Thu, 8 Jun 2017 01:57:25 +0200
> Mike Belopuhov <m...@belopuhov.com> wrote:
> 
> > On Wed, Jun 07, 2017 at 18:35 -0500, Amit Kulkarni wrote:
> > > Wow, please get this in!!!
> > > 
> > > This fixes cvs update on hard disks, to go much much faster. When I am
> > > updating the entire set of cvs trees: www, src, xenocara, ports, I can
> > > still use firefox and have it perfectly usable. There's a night and
> > > day improvement, before and after. Thanks for debugging and fixing
> > > this.
> > >
> > 
> > What kind of broken hardware do you have that this diff helps you?
> > Can you show us your dmesg?
> > 

Please ignore previous dmesg, it was incomplete.

OpenBSD 6.1-current (GENERIC.MP) #0: Wed Jun  7 18:11:29 CDT 2017
a...@pilloo-saru.my.domain:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 4183433216 (3989MB)
avail mem = 4050853888 (3863MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xecc70 (82 entries)
bios0: vendor LENOVO version "M0KKT23A" date 02/19/2016
bios0: LENOVO 10HS007RUS
acpi0 at bios0: rev 2
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP APIC FPDT DBGP SLIC MSDM SSDT SSDT MCFG HPET SSDT SSDT 
BGRT LUFT
acpi0: wakeup devices PXSX(S4) RP01(S4) PXSX(S4) PXSX(S4) RP03(S4) PXSX(S4) 
PXSX(S4) RP05(S4) PXSX(S4) PXSX(S4) PXSX(S4) GLAN(S4) EHC1(S3) EHC2(S3) 
XHC_(S3) HDEF(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i3-4170 CPU @ 3.70GHz, 3692.03 MHz
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: TSC frequency 3692032960 Hz
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Core(TM) i3-4170 CPU @ 3.70GHz, 3691.46 MHz
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 1 (application processor)
cpu2: Intel(R) Core(TM) i3-4170 CPU @ 3.70GHz, 3691.46 MHz
cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 1, core 0, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Core(TM) i3-4170 CPU @ 3.70GHz, 3691.46 MHz
cpu3: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 1, core 1, package 0
ioapic0 at mainbus0: apid 8 pa 0xfec0, version 20, 24 pins
acpimcfg0 at acpi0 addr 0xf800, bus 0-63
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (RP01)
acpiprt2 at acpi0: bus 2 (RP03)
acpiprt3 at acpi0: bus 3 (RP05)
acpiprt4 at acpi0: bus -1 (PEG0)
acpiprt5 at acpi0: bus -1 (PEG1)
acpiprt6 at acpi0: bus -1 (PEG2)
acpiec0 at acpi0: not present
acpicpu0 at acpi0: C2(200@148 mwait.1@0x31), C1(1000@1 mwait.1), PSS
acpicpu1 at acpi0: C2(200@148 mwait.1@0x31), C1(1000@1 mwait.1), PSS
acpicpu2 at acpi0: C2(200@148 mwait.1@0x31), C1(1000@1 mwait.1), PSS
acpicpu3 at acpi0: C2(200@148 mwait.1@0x31), C1(1000@1 mwait.1), PSS
acpipwrres0 at acpi0: FN00, resource for FAN0
acpipwrres1 at acpi0: FN01, resource for FAN1
acpipwrres2 at acpi0: FN02, resource for FAN2
acpipwrres3 at acpi0: FN03, resource for FAN3
acpipwrres4 at acpi0: FN04, resource for FAN4
acpitz0 at acpi0: critical temperature is 105 degC
acpitz1 a

Re: Better handling of short reads

2017-06-07 Thread Amit Kulkarni
On Thu, 8 Jun 2017 01:57:25 +0200
Mike Belopuhov <m...@belopuhov.com> wrote:

> On Wed, Jun 07, 2017 at 18:35 -0500, Amit Kulkarni wrote:
> > Wow, please get this in!!!
> > 
> > This fixes cvs update on hard disks, to go much much faster. When I am
> > updating the entire set of cvs trees: www, src, xenocara, ports, I can
> > still use firefox and have it perfectly usable. There's a night and
> > day improvement, before and after. Thanks for debugging and fixing
> > this.
> >
> 
> What kind of broken hardware do you have that this diff helps you?
> Can you show us your dmesg?
> 


OpenBSD 6.1-current (GENERIC.MP) #0: Wed Jun  7 18:11:29 CDT 2017
a...@pilloo-saru.my.domain:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 4183433216 (3989MB)
avail mem = 4050853888 (3863MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xecc70 (82 entries)
bios0: vendor LENOVO version "M0KKT23A" date 02/19/2016
bios0: LENOVO 10HS007RUS
acpi0 at bios0: rev 2
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP APIC FPDT DBGP SLIC MSDM SSDT SSDT MCFG HPET SSDT SSDT 
BGRT LUFT
acpi0: wakeup devices PXSX(S4) RP01(S4) PXSX(S4) PXSX(S4) RP03(S4) PXSX(S4) 
PXSX(S4) RP05(S4) PXSX(S4) PXSX(S4) PXSX(S4) GLAN(S4) EHC1(S3) EHC2(S3) 
XHC_(S3) HDEF(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i3-4170 CPU @ 3.70GHz, 3691.97 MHz
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: TSC frequency 3691968320 Hz
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Core(TM) i3-4170 CPU @ 3.70GHz, 3691.45 MHz
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 1 (application processor)
cpu2: Intel(R) Core(TM) i3-4170 CPU @ 3.70GHz, 3691.45 MHz
cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 1, core 0, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Core(TM) i3-4170 CPU @ 3.70GHz, 3691.45 MHz
cpu3: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 1, core 1, package 0
ioapic0 at mainbus0: apid 8 pa 0xfec0, version 20, 24 pins
acpimcfg0 at acpi0 addr 0xf800, bus 0-63
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (RP01)
acpiprt2 at acpi0: bus 2 (RP03)
acpiprt3 at acpi0: bus 3 (RP05)
acpiprt4 at acpi0: bus -1 (PEG0)
acpiprt5 at acpi0: bus -1 (PEG1)
acpiprt6 at acpi0: bus -1 (PEG2)
acpiec0 at acpi0: not present
acpicpu0 at acpi0: C2(200@148 mwait.1@0x31), C1(1000@1 mwait.1), PSS
acpicpu1 at acpi0: C2(200@148 mwait.1@0x31), C1(1000@1 mwait.1), PSS
acpicpu2 at acpi0: C2(200@148 mwait.1@0x31), C1(1000@1 mwait.1), PSS
acpicpu3 at acpi0: C2(200@148 mwait.1@0x31), C1(1000@1 mwait.1), PSS
acpipwrres0 at acpi0: FN00, resource for FAN0
acpipwrres1 at acpi0: FN01, resource for FAN1
acpipwrres2 at acpi0: FN02, resource for FAN2
acpipwrres3 at acpi0: FN03, resource for FAN3
acpipwrres4 at acpi0: FN04, resource for FAN4
acpitz0 at acpi0: critical temperature is 105 degC
acpitz1 at acpi0: critical temperature is 105 degC
"INT3F0D" at acpi0 not configured
"PNP0501" at acpi0 not configured
acpibtn0 at acpi0: PWRB
"PNP0C0B" at acpi0 not configured
"PNP0C

Re: Better handling of short reads

2017-06-07 Thread Amit Kulkarni
Wow, please get this in!!!

This fixes cvs update on hard disks, to go much much faster. When I am
updating the entire set of cvs trees: www, src, xenocara, ports, I can
still use firefox and have it perfectly usable. There's a night and
day improvement, before and after. Thanks for debugging and fixing
this.

amit

On Wed, Jun 7, 2017 at 12:29 PM, Mike Belopuhov  wrote:
> Hi,
>
> I've discovered that short reads (nonzero b_resid) aren't
> handled very well in our kernel and I've proposed a diff
> like this to handle short reads of buffercache read-ahead
> buffers:
>
> diff --git sys/kern/vfs_bio.c sys/kern/vfs_bio.c
> index 95bc80bc0e6..1cc1943d752 100644
> --- sys/kern/vfs_bio.c
> +++ sys/kern/vfs_bio.c
> @@ -534,11 +534,27 @@ bread_cluster_callback(struct buf *bp)
>  */
> buf_fix_mapping(bp, newsize);
> bp->b_bcount = newsize;
> }
>
> -   for (i = 1; xbpp[i] != 0; i++) {
> +   /* Invalidate read-ahead buffers if read short */
> +   if (bp->b_resid > 0) {
> +   for (i = 0; xbpp[i] != NULL; i++)
> +   continue;
> +   for (i = i - 1; i != 0; i--) {
> +   if (xbpp[i]->b_bufsize <= bp->b_resid) {
> +   bp->b_resid -= xbpp[i]->b_bufsize;
> +   SET(xbpp[i]->b_flags, B_INVAL);
> +   } else if (bp->b_resid > 0) {
> +   bp->b_resid = 0;
> +   SET(xbpp[i]->b_flags, B_INVAL);
> +   } else
> +   break;
> +   }
> +   }
> +
> +   for (i = 1; xbpp[i] != NULL; i++) {
> if (ISSET(bp->b_flags, B_ERROR))
> SET(xbpp[i]->b_flags, B_INVAL | B_ERROR);
> biodone(xbpp[i]);
> }
>
>
> Now I said before that the only issue that this diff didn't
> fix was with the xbpp[0] aka the buf we return to FFS: if we
> have a 64k sized cluster on our filesystem then we've never
> created read-ahead bufs and thus this code never runs and we
> never account for the b_resid.  However, this is thankfully
> not correct as FFS handles short reads itself (except one
> small detail...). Here's a chunk from ffs_read:
>
> if (lblktosize(fs, nextlbn) >= DIP(ip, size))
> error = bread(vp, lbn, size, &bp);
> else
> error = bread_cluster(vp, lbn, size, &bp);
>
> if (error)
> break;
>
> /*
>  * We should only get non-zero b_resid when an I/O error
>  * has occurred, which should cause us to break above.
>  * However, if the short read did not cause an error,
>  * then we want to ensure that we do not uiomove bad
>  * or uninitialized data.
>  */
> size -= bp->b_resid;
> if (size < xfersize) {
> if (size == 0)
> break;
> xfersize = size;
> }
> error = uiomove(bp->b_data + blkoffset, xfersize, uio);
>
> As you can see it copies (size - bp->b_resid) into the uio.
> That would be OK if b_resid was as large as the 'size'. But
> due to how bread_cluster extends the b_bcount to cover for
> all additional read-ahead buffers, the transfer in the end
> can have a b_resid anywhere in the interval of [0, MAXPHYS]
> which can be larger than 'size' that FFS has asked for.
>
> This leads to 'size' underflow because it's an integer and
> then uiomove gets a negative value for xfersize which gets
> converted to a very large unsigned long (size_t) parameter
> for uiomove. And this is bad.  Therefore, additionally I'd
> like to assert this in the FFS code itself.  If this is the
> way to go, I'll look into other filesystems and propose a
> similar check.
>
> diff --git sys/ufs/ffs/ffs_vnops.c sys/ufs/ffs/ffs_vnops.c
> index 160e187820f..56c222612a2 100644
> --- sys/ufs/ffs/ffs_vnops.c
> +++ sys/ufs/ffs/ffs_vnops.c
> @@ -244,10 +244,11 @@ ffs_read(void *v)
>  * has occurred, which should cause us to break above.
>  * However, if the short read did not cause an error,
>  * then we want to ensure that we do not uiomove bad
>  * or uninitialized data.
>  */
> +   KASSERT(bp->b_resid <= size);
> size -= bp->b_resid;
> if (size < xfersize) {
> if (size == 0)
> break;
> xfersize = size;
>
>
> So to make it clear: I'd like to commit both changes and
> if that's something we agree upon, I'll look into other
> filesystems and make sure that they implement similar
> assertions.
>
> Opinions?
>
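
The underflow Mike describes can be reproduced in userland. The following is a minimal sketch, not kernel code: unchecked_xfersize and checked_xfersize are hypothetical helpers that mirror the "size -= bp->b_resid" arithmetic quoted from ffs_read, with plain ints standing in for the buf fields.

```c
#include <stddef.h>

/*
 * Mirrors the arithmetic in ffs_read without any guard.  'size' and
 * 'xfersize' are ints; 'resid' stands in for bp->b_resid.
 */
static size_t
unchecked_xfersize(int size, int xfersize, int resid)
{
	size -= resid;			/* goes negative if resid > size */
	if (size < xfersize)
		xfersize = size;
	return (size_t)xfersize;	/* negative int becomes a huge size_t */
}

/*
 * Same arithmetic, but a residual larger than the request is refused
 * first, which is the condition the proposed KASSERT checks.
 * Returns 0 on success, -1 if resid > size.
 */
static int
checked_xfersize(int size, int xfersize, int resid, size_t *out)
{
	if (resid > size)
		return -1;
	size -= resid;
	if (size < xfersize)
		xfersize = size;
	*out = (size_t)xfersize;
	return 0;
}
```

With a 16 KB request and a 48 KB residual, the unchecked helper returns a value close to SIZE_MAX, while the checked one rejects the input before any length is computed.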



Remove KGDB leftover in i386/amd64 GENERIC

2017-04-30 Thread Amit Kulkarni


Index: amd64/conf/GENERIC
===
RCS file: /cvs/src/sys/arch/amd64/conf/GENERIC,v
retrieving revision 1.442
diff -u -p -u -p -r1.442 GENERIC
--- amd64/conf/GENERIC  12 Mar 2017 21:31:18 -  1.442
+++ amd64/conf/GENERIC  30 Apr 2017 19:53:15 -
@@ -19,9 +19,6 @@ optionUSER_PCICONF# user-space PCI co
 option APERTURE# in-kernel aperture driver for XFree86
 option MTRR# CPU memory range attributes control
 
-#option KGDB # Remote debugger support; exclusive of DDB
-#option"KGDB_DEVNAME=\"com\"",KGDBADDR=0x2f8,KGDBRATE=9600
-
 option NTFS# NTFS support
 option HIBERNATE   # Hibernate support
 
Index: i386/conf/GENERIC
===
RCS file: /cvs/src/sys/arch/i386/conf/GENERIC,v
retrieving revision 1.826
diff -u -p -u -p -r1.826 GENERIC
--- i386/conf/GENERIC   13 Jan 2017 20:30:59 -  1.826
+++ i386/conf/GENERIC   30 Apr 2017 19:53:15 -
@@ -20,9 +20,6 @@ optionKVM86   # Kernel Virtual 8086 emu
 option APERTURE# in-kernel aperture driver for XFree86
 option MTRR# CPU memory range attributes control
 
-#option KGDB # Remote debugger support; exclusive of DDB
-#option"KGDB_DEVNAME=\"com\"",KGDBADDR=0x2f8,KGDBRATE=9600
-
 option NTFS# NTFS support
 option HIBERNATE   # Hibernate support
 



remove cpuset dead code

2017-01-31 Thread Amit Kulkarni
Unused under /sys, survived a kernel build without problems.



Index: kern/kern_sched.c
===
RCS file: /cvs/src/sys/kern/kern_sched.c,v
retrieving revision 1.44
diff -u -p -u -p -r1.44 kern_sched.c
--- kern/kern_sched.c   21 Jan 2017 05:42:03 -  1.44
+++ kern/kern_sched.c   1 Feb 2017 04:04:37 -
@@ -721,12 +721,6 @@ cpuset_init_cpu(struct cpu_info *ci)
 }
 
 void
-cpuset_clear(struct cpuset *cs)
-{
-   memset(cs, 0, sizeof(*cs));
-}
-
-void
 cpuset_add(struct cpuset *cs, struct cpu_info *ci)
 {
unsigned int num = CPU_INFO_UNIT(ci);
@@ -748,12 +742,6 @@ cpuset_isset(struct cpuset *cs, struct c
 }
 
 void
-cpuset_add_all(struct cpuset *cs)
-{
-   cpuset_copy(cs, &cpuset_all);
-}
-
-void
 cpuset_copy(struct cpuset *to, struct cpuset *from)
 {
memcpy(to, from, sizeof(*to));
@@ -769,15 +757,6 @@ cpuset_first(struct cpuset *cs)
return (cpuset_infos[i * 32 + ffs(cs->cs_set[i]) - 1]);
 
return (NULL);
-}
-
-void
-cpuset_union(struct cpuset *to, struct cpuset *a, struct cpuset *b)
-{
-   int i;
-
-   for (i = 0; i < CPUSET_ASIZE(ncpus); i++)
-   to->cs_set[i] = a->cs_set[i] | b->cs_set[i];
 }
 
 void
Index: sys/proc.h
===
RCS file: /cvs/src/sys/sys/proc.h,v
retrieving revision 1.232
diff -u -p -u -p -r1.232 proc.h
--- sys/proc.h  31 Jan 2017 07:44:55 -  1.232
+++ sys/proc.h  1 Feb 2017 04:04:37 -
@@ -581,13 +581,10 @@ struct cpuset {
 
 void cpuset_init_cpu(struct cpu_info *);
 
-void cpuset_clear(struct cpuset *);
 void cpuset_add(struct cpuset *, struct cpu_info *);
 void cpuset_del(struct cpuset *, struct cpu_info *);
 int cpuset_isset(struct cpuset *, struct cpu_info *);
-void cpuset_add_all(struct cpuset *);
 void cpuset_copy(struct cpuset *, struct cpuset *);
-void cpuset_union(struct cpuset *, struct cpuset *, struct cpuset *);
 void cpuset_intersection(struct cpuset *t, struct cpuset *, struct cpuset *);
 void cpuset_complement(struct cpuset *, struct cpuset *, struct cpuset *);
 struct cpu_info *cpuset_first(struct cpuset *);




Re: Futexes for OpenBSD

2016-09-02 Thread Amit Kulkarni
The new files should have the ISC/OpenBSD license inserted at the top.

IMHO, the ticket changes are a separate diff and you will be able to commit
that part first.

Also, replacing struct _spinlock with int is also a separate diff.

Those changes don't clash with futexes, so less chance of being reverted.

Thanks


On Fri, Sep 2, 2016 at 8:36 AM, Michal Mazurek  wrote:

> Here is a working futex implementation for OpenBSD. This diff touches
> the kernel and librthread.
>
> * get rid of tickets from rthreads, they were getting in the way and are
> unused anyway
> * replace all struct _spinlock with int
> * use futexes instead of spinlocks everywhere within librthread
> * librthread no longer calls sched_yield(), nor does it spin
>
> Any comments?
>
> Index: lib/librthread/Makefile
> ===
> RCS file: /cvs/src/lib/librthread/Makefile,v
> retrieving revision 1.43
> diff -u -p -r1.43 Makefile
> --- lib/librthread/Makefile 1 Jun 2016 04:34:18 -   1.43
> +++ lib/librthread/Makefile 2 Sep 2016 13:09:44 -
> @@ -18,7 +18,8 @@ CFLAGS+=-DNO_PIC
>  VERSION_SCRIPT= ${.CURDIR}/Symbols.map
>
>  .PATH: ${.CURDIR}/arch/${MACHINE_CPU}
> -SRCS=  rthread.c \
> +SRCS=  futex.c \
> +   rthread.c \
> rthread_attr.c \
> rthread_barrier.c \
> rthread_barrier_attr.c \
> Index: lib/librthread/futex.c
> ===
> RCS file: lib/librthread/futex.c
> diff -N lib/librthread/futex.c
> --- /dev/null   1 Jan 1970 00:00:00 -
> +++ lib/librthread/futex.c  2 Sep 2016 13:09:44 -
> @@ -0,0 +1,41 @@
> +#include 
> +
> +#include 
> +#include "thread_private.h"
> +#include "rthread.h"
> +
> +inline int
> +futex_lock(volatile int *val)
> +{
> +   int c;
> +
> +   if ((c = __sync_val_compare_and_swap(val, 0, 1)) != 0) {
> +   do {
> +   if (c == 2 || __sync_val_compare_and_swap(val, 1, 2) != 0) {
> +   futex(val, FUTEX_WAIT, 2, NULL, NULL, 0);
> +   }
> +   } while ((c = __sync_val_compare_and_swap(val, 0, 2)) != 0);
> +   }
> +
> +   return 0;
> +}
> +
> +inline int
> +futex_trylock(volatile int *val)
> +{
> +   if ((__sync_val_compare_and_swap(val, 0, 1)) != 0)
> +   return 1;
> +
> +   return 0;
> +}
> +
> +inline int
> +futex_unlock(volatile int *val)
> +{
> +   if (__sync_sub_and_fetch(val, 1) != 0) {
> +   *val = 0;
> +   futex(val, FUTEX_WAKE, 1, NULL, NULL, 0);
> +   }
> +
> +   return 0;
> +}
> Index: lib/librthread/rthread.c
> ===
> RCS file: /cvs/src/lib/librthread/rthread.c,v
> retrieving revision 1.92
> diff -u -p -r1.92 rthread.c
> --- lib/librthread/rthread.c1 Sep 2016 10:41:02 -   1.92
> +++ lib/librthread/rthread.c2 Sep 2016 13:09:44 -
> @@ -63,15 +63,15 @@ REDIRECT_SYSCALL(thrkill);
>
>  static int concurrency_level;  /* not used */
>
> -struct _spinlock _SPINLOCK_UNLOCKED_ASSIGN = _SPINLOCK_UNLOCKED;
> +int _SPINLOCK_UNLOCKED_ASSIGN = _SPINLOCK_UNLOCKED;
>
>  int _threads_ready;
>  size_t _thread_pagesize;
>  struct listhead _thread_list = LIST_HEAD_INITIALIZER(_thread_list);
> -struct _spinlock _thread_lock = _SPINLOCK_UNLOCKED;
> +int _thread_lock = 0;
>  static struct pthread_queue _thread_gc_list
>  = TAILQ_HEAD_INITIALIZER(_thread_gc_list);
> -static struct _spinlock _thread_gc_lock = _SPINLOCK_UNLOCKED;
> +int _thread_gc_lock = 0;
>  static struct pthread _initial_thread;
>
>  struct pthread_attr _rthread_attr_default = {
> @@ -88,23 +88,22 @@ struct pthread_attr _rthread_attr_defaul
>  /*
>   * internal support functions
>   */
> -void
> -_spinlock(volatile struct _spinlock *lock)
> +inline void
> +_spinlock(volatile int *lock)
>  {
> -   while (_atomic_lock(&lock->ticket))
> -   sched_yield();
> +   futex_lock(lock);
>  }
>
> -int
> -_spinlocktry(volatile struct _spinlock *lock)
> +inline int
> +_spinlocktry(volatile int *lock)
>  {
> -   return 0 == _atomic_lock(&lock->ticket);
> +   return 0 == futex_trylock(lock);
>  }
>
> -void
> -_spinunlock(volatile struct _spinlock *lock)
> +inline void
> +_spinunlock(volatile int *lock)
>  {
> -   lock->ticket = _ATOMIC_LOCK_UNLOCKED;
> +   futex_unlock(lock);
>  }
>
>  static void
> @@ -643,7 +642,7 @@ _thread_dump_info(void)
>  void
>  _rthread_dl_lock(int what)
>  {
> -   static struct _spinlock lock = _SPINLOCK_UNLOCKED;
> +   static int lock = _SPINLOCK_UNLOCKED;
> static pthread_t owner = NULL;
> static struct pthread_queue lockers =
> TAILQ_HEAD_INITIALIZER(lockers);
> static int count = 0;
> @@ -658,8 +657,7 @@ _rthread_dl_lock(int what)
> } else if (owner != self) {
> TAILQ_INSERT_TAIL(&lockers, self, waiting);
>
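
For readers following the futex.c part of the quoted diff: the lock word takes three values, 0 (unlocked), 1 (locked, no waiters) and 2 (locked, possibly with waiters). Below is a minimal userland sketch of the trylock and unlock transitions, written with C11 atomics rather than the __sync builtins. The sketch_ names are hypothetical, and the futex(2) wake-up is replaced by a return value so the example runs without the new syscall.

```c
#include <stdatomic.h>

enum { SK_UNLOCKED = 0, SK_LOCKED = 1, SK_CONTENDED = 2 };

/* Returns 0 if the lock was taken, 1 if it was busy. */
static int
sketch_trylock(atomic_int *val)
{
	int expected = SK_UNLOCKED;

	return atomic_compare_exchange_strong(val, &expected, SK_LOCKED) ? 0 : 1;
}

/*
 * Releases the lock.  Returns 1 if the word was in the contended state,
 * i.e. the point where the diff calls futex(val, FUTEX_WAKE, 1, ...),
 * and 0 if nobody could have been waiting.
 */
static int
sketch_unlock(atomic_int *val)
{
	/* atomic_fetch_sub returns the value held before the subtraction. */
	if (atomic_fetch_sub(val, 1) != SK_LOCKED) {
		atomic_store(val, SK_UNLOCKED);
		return 1;
	}
	return 0;
}
```

The uncontended path never enters the kernel, which is why the diff can drop the sched_yield() loop.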

Re: Not enough inodes on /usr for ports/xenocara

2016-06-14 Thread Amit Kulkarni
On Tue, Jun 14, 2016 at 8:41 AM, Marc Peters  wrote:

> Hi,
>
> i just did an installation of a HP DL360 Gen9 to test UEFI installations
> with 5.9. I just accepted most of the defaults and did this for the
> disklabels, too. I wanted to checkout the sources to do release builds,
> but the autolayout didn't create enough inodes to do so:
>
> [snip]
> U xenocara/xserver/composite/Makefile.am
> U xenocara/xserver/composite/Makefile.in
> U xenocara/xserver/composite/compalloc.c
> U xenocara/xserver/composite/compext.c
> U xenocara/xserver/composite/compinit.c
> U xenocara/xserver/composite/compint.h
> U xenocara/xserver/composite/compositeext.h
> U xenocara/xserver/composite/compoverlay.c
> U xenocara/xserver/composite/compwindow.c
>
> /usr: create/symlink failed, no inodes free
> cvs [checkout aborted]: cannot open
> xenocara/xserver/config/CVS/Repository: No space left on device
>
> The layout is:
>
> /usr # df -ikl
> Filesystem  1K-blocks     Used      Avail Capacity   iused     ifree %iused  Mounted on
> /dev/sd0a     1028878    53078     924358     5%      1767    154135    1%   /
> /dev/sd0l   202395708        4  192275920     0%         1  12794877    0%   /home
> /dev/sd0d     4125390       10    3919112     0%         6    545656    0%   /tmp
> /dev/sd0f     2061054  1578084     379918    81%    285822         0  100%   /usr
> /dev/sd0g     1028878   214660     762776    22%      9218    146684    6%   /usr/X11R6
> /dev/sd0h    10318462    26502    9776038     0%      1820   1323362    0%   /usr/local
> /dev/sd0k     2061054        2    1958000     0%         1    285821    0%   /usr/obj
> /dev/sd0j     2061054   805962    1152040    41%    110019    175803   38%   /usr/src
> /dev/sd0e    36618012     5154   34781958     0%       186   4676932    0%   /var
>
>
/usr/{src,obj} is usually allocated 2 GB, you have the same 2 GB allocated
for /usr (df -h might be friendlier to see and confirm). I usually mount
/usr/ports with 100 GB (thinking of splitting it some more), /usr/xenocara
with 1-2 GB, and /usr/xobj with 1-2 GB.

If this is a newly installed machine and you don't want to fiddle with it,
you can symlink /usr/xenocara to /home/marc/. But I would re-install with
the correct layout. It usually takes a few tries to get it right for your
workflow.
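
To check inode headroom before starting a large checkout, the counts that df -i prints can also be read from a program. This is a minimal sketch using POSIX statvfs(3); the helper name free_inodes is made up for the example.

```c
#include <sys/statvfs.h>

/*
 * Store the number of free inodes on the filesystem holding 'path'
 * in *ifree.  Returns 0 on success, -1 on error (errno set by statvfs).
 */
static int
free_inodes(const char *path, unsigned long long *ifree)
{
	struct statvfs sv;

	if (statvfs(path, &sv) == -1)
		return -1;
	*ifree = (unsigned long long)sv.f_ffree;
	return 0;
}
```

Calling it on /usr from the listing above would report 0, which matches the "no inodes free" error even though blocks are still available.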

Good luck!


Requesting /usr/bin/doas to be moved to /bin

2016-04-14 Thread Amit Kulkarni
The title says it all. Is there any interest in moving doas so that it can 
be statically compiled in without depending on libc.so?

I am also surprised su and login are in /usr/bin. A few minutes ago, 
I mistakenly deleted libc.so. I had several shells open but no root shell 
:(. I recovered using bsd.rd

I think su/login are old utils and their paths are written in stone due to 
many scripts out there in the wild, but doas is newer so it might be 
considered?

Thanks



Options DDB/DIAGNOSTIC

2016-04-05 Thread Amit Kulkarni
Hello,

There is a bunch of code which is always on, but ifdef'd DDB/DIAGNOSTIC. 
Those options have been turned on forever in GENERIC. Any interest in 
removing code like this? The ifdef/endif DIAGNOSTIC will just be removed 
by unifdef...

If there is any interest, I will go file by file and send diffs.

Thanks

#ifdef DIAGNOSTIC
..
//code to be kept
..
#else
..
//code to remove
..
#endif
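
As a concrete sketch of the cleanup (the functions below are invented for illustration, they are not from the tree): with DIAGNOSTIC always defined, the #else branch is dead code, and stripping the conditional leaves only the first branch.

```c
#define DIAGNOSTIC 1	/* stands in for "option DIAGNOSTIC" in GENERIC */

/* Before: both branches are in the source, only the first is compiled. */
static int
check_len_before(int len)
{
#ifdef DIAGNOSTIC
	if (len < 0)
		return -1;	/* code to be kept */
#else
	(void)len;		/* code to remove */
#endif
	return 0;
}

/* After: what remains once the conditional is stripped. */
static int
check_len_after(int len)
{
	if (len < 0)
		return -1;
	return 0;
}
```

Both functions behave identically as long as DIAGNOSTIC is defined, so the change is purely a source cleanup.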



Re: Scheduler hack for multi-threaded processes

2016-03-22 Thread Amit Kulkarni
On Tue, Mar 22, 2016 at 10:00 AM, Douglas Ray <doug...@cpan.org> wrote:

> On 21/03/16 11:29 AM, Mark Kettenis wrote:
>
>> From: Amit Kulkarni <amitk...@gmail.com>
>>> Date: Sun, 20 Mar 2016 17:57:49 -0500
>>>
>> ...
>
>> +1. Previously, when I did a cvs update with original scheduler code,
>>> doing
>>> the ports update the machine always froze solid while doing cvs update,
>>> taking 3 minutes to recover. This time with Martin's patch, the freezing
>>> period seems to have decreased quite a bit, although the freeze still
>>> happens. Stefan's amap diff and Bob's VFS/UVM diff also seem to have
>>> made a difference.
>>>
>>> Pentium G2020 2.9GHz dual core Ivy bridge 22nm... 8 GB RAM
>>>
>>> IMHO, this patch should go in!
>>>
>>
>> No.  It's a hack. It points out a problem that should be investigated
>> deeper.
>>
>>
> If it gives a significant performance improvement but is too distant
> from a real solution, maybe it could be worth distributing in
> package(s) form.
>
> The team then has the option to actively promote it, or not; and
> the package could be updated as new refinements are found.
>
>
I am not sure if you know how the OpenBSD team operates. They call a hack a
hack and will do it right. I am thankful that Michal pointed out a
deficiency in our current implementation, and Martin reduced the test case.
That will help to figure out where the problem(s) is/are. To repeat, the
team will do it right.

Thanks


Re: Scheduler hack for multi-threaded processes

2016-03-20 Thread Amit Kulkarni
On Sat, Mar 19, 2016 at 4:35 PM, Bob  wrote:

>
> I'm also interested in hearing from more people using multi-threaded
>> applications.
>>
>
> I applied the patch to my old core duo p7570 running CURRENT/amd64.
> Firefox is more responsive and youtube videos, previously impossible to
> watch, run smoothly, even in full screen. Simultaneously did some photo
> editing work in gimp, ran gigabyte sized file transfers with sftp and took
> a CVS update on ports without any hiccups. Browsing in Firefox remained
> smooth although I heard a few glitches in audio playback when scrolling
> while a busy page was still loading, with load average hovering around 2.5
> with 58 processes and 138 threads at the time. Otherwise smooth sailing,
> very nice.
>
> Bob
>
>
+1. Previously, with the original scheduler code, the machine always froze
solid during a cvs update of the ports tree, taking 3 minutes to recover.
This time with Martin's patch, the freezing period seems to have decreased
quite a bit, although the freeze still happens. Stefan's amap diff and
Bob's VFS/UVM diff also seem to have made a difference.

Pentium G2020 2.9GHz dual core Ivy bridge 22nm... 8 GB RAM

IMHO, this patch should go in!

Thanks


Re: New scheduler for OpenBSD

2016-03-13 Thread Amit Kulkarni
On Sat, Mar 12, 2016 at 10:36 AM, Michal Mazurek 
wrote:

> Gregor Best attempted to improve the scheduler in 2011:
> http://comments.gmane.org/gmane.os.openbsd.tech/27059
> Here is another attempt, it takes up where the previous one left off.
>
> This is also mostly based on the main idea behind Linux CFS or
> BFS. I found BFS to be described more clearly:
> http://ck.kolivas.org/patches/bfs/4.0/4.3/4.3-sched-bfs-467.patch
>
>
> Some notes:
>
> Chrome is still not very usable.
>
> Much more work is needed, e.g. there is some MD code on sparc64 and
> alpha that depends on spc_schedticks that needs to be understood and
> rewritten.
>
> Maybe using RB trees to queue what is usually no more than 5 elements
> is overkill.
>
> p_usrpri and p_priority will go away, so userland utilities like 'ps'
> will need to be changed.
>
> I also want to try and see if implementing one shared queue, instead of
> keeping one queue per cpu will improve performance even further. Right
> now there are some heuristics to determine whether a process should
> switch cpus. This doesn't work very well yet, in my tests with the
> attached code sometimes one queue was a second behind another. From
> what I understand that's the idea behind BFS and the reason why it
> doesn't scale to 4096 CPUs. I see that OpenBSD supports 256 CPUs on
> sparc64:
> ./arch/sparc64/include/cpu.h:#define MAXCPUS 256
>
>
Hi Michal,
One shared queue scales badly as the number of CPUs increases, because it
effectively trends towards a global lock.
Thanks


mention /etc/doas.conf instead of plain doas.conf

2015-12-31 Thread Amit Kulkarni
I just switched from sudo to doas and was stumped by this.

The doas code expects doas.conf in /etc yet the manpage does not explicitly
make that clear. I added a SYNOPSIS like in "man login.conf".

Thanks

Index: doas.conf.5
===
RCS file: /cvs/src/usr.bin/doas/doas.conf.5,v
retrieving revision 1.16
diff -u -p -u -p -r1.16 doas.conf.5
--- doas.conf.5 1 Sep 2015 13:20:53 -   1.16
+++ doas.conf.5 31 Dec 2015 23:39:23 -
@@ -19,12 +19,14 @@
 .Sh NAME
 .Nm doas.conf
 .Nd doas configuration file
+.Sh SYNOPSIS
+.Nm /etc/doas.conf
 .Sh DESCRIPTION
 The
 .Xr doas 1
 utility executes commands as other users according to the rules
-in the
-.Nm
+in
+.Nm /etc/doas.conf
 configuration file.
 .Pp
 The rules have the following format:


Re: printf(3) wording

2015-11-17 Thread Amit Kulkarni
On Tue, Nov 17, 2015 at 11:22 AM, Jason McIntyre  wrote:

> On Tue, Nov 17, 2015 at 06:14:33PM +0100, Jan Stary wrote:
> > On Nov 17 17:06:11, j...@kerhand.co.uk wrote:
> > > On Tue, Nov 17, 2015 at 10:38:41AM +0100, Jan Stary wrote:
> > > > I am not a native speaker, but the conversion specifiers
> > > > are "interpreted" by printf, not "interpolated", right?
> > > >
> > > >   Jan
> > > >
> > >
> > > i don't know how these implementations work, so it's hard to say.
> > > perhaps they are interpolated. maybe use cvs to track down the author
> > > and ask them?
> > >
> > > whatever the outcome, if you want to change this text you probably want
> > > to adjust a few more:
> > >
> > > /usr/src/lib/libc/gen/err.3:for later interpolation by the
> > > /usr/src/lib/libc/gen/setproctitle.3:for later interpolation by
> > > /usr/src/lib/libc/gen/syslog.3:for later interpolation by
> > > /usr/src/lib/libc/stdio/printf.3:for later interpolation by
> >
> > Hm, probably just my English;
> > sorry for the noise.
> >
> >   Jan
> >
>
> not necessarily. the author may have been confused too. who knows. i
> had to look up "interpolation" myself. it's a word i'd prefer to avoid
> in man pages if we can ;)
>
>
jan is right, interpretation is the correct word. Interpolation is a
mathematical term. Here the program is going to act on the conversion
specifiers, so interpret...


Re: sqlite 3.8.11.1

2015-09-09 Thread Amit Kulkarni
On Wed, Sep 9, 2015 at 11:12 AM, Miod Vallat  wrote:

> > > When espie@ imported sqlite he wanted to follow upstream so he
> imported
> > > what was distributed with sqlite. Since then we do tagged (based on the
> > > sqlite version) imports whenever we do an update. So when a diff is
> sent
> > > out it includes all new files in that sqlite release. In this case
> there
> > > is a new fts5 backend which contains a lot of tests (which we never
> > > run). We also haven't enabled the fts5 backend at this time.
> > >
>

AFAIK, the original rationale for importing sqlite into base was for
storing the database table (INDEX?) for building ports using dpb. It can be
switched to a port module with some pains.


Re: apmd hangs

2014-09-09 Thread Amit Kulkarni
On Tue, Sep 9, 2014 at 2:13 PM, David Coppa dco...@gmail.com wrote:

 On Tue, Sep 9, 2014 at 7:58 PM, Ingo Schwarze schwa...@usta.de wrote:
  Hi David,
 
  David Coppa wrote on Tue, Sep 09, 2014 at 07:44:47PM +0200:
  On Tue, Sep 9, 2014 at 7:27 PM, Ingo Schwarze schwa...@usta.de wrote:
 
  i'm sorry to say it makes no difference for me (i'm not opposed to the
  diff, though).
 
  On my laptop, building ports works fine, running firefox works fine,
  but whenever i surf the web with firefox while building ports,
  the machine locks up hard.  Sometimes, the lockup already happens
  when merely starting firefox while building ports.  Often, it
  happens not when requesting a new URI, but when merely scrolling
  within the page in firefox.
 
  After the lockup, CapsLk and NmLk still toggle the respective LEDs,
  Fn-PgUp still switches on and off the torch, but nothing else has
  any effect, not even Ctrl-Alt-Esc, Ctrl-Alt-Delete, Ctrl-Alt-Backspace
  or Ctrl-Alt-F1.
 
  Unfortunately, i cannot break into ddb because i don't have a
  docking station, hence no serial console, and when going to the
  PC virtual console (Ctrl-Alt-F1), setting export DISPLAY=:0,
  and starting firefox from the console, i was unable to get any
  lockup.  Apparently, it only happens when X (or whatever) is
  actually painting something onto the screen.
 
  Whether i run with the defaults or with apm -A doesn't appear to
  make a difference.
 
  I'm a bit confused... Is this hang happening without apmd running?
 
  Yes.  That doesn't make a difference, either.
 
  Usually, i run with apmd in default mode:
 
ischwarze@isnote $ grep apm /etc/rc.conf.local
apmd_flags=
 
  But with apmd_flags=-A or apmd_flags=NO the hangs happen in
  exactly the same way.

 So I'm with Mark here, I also think your hang is unrelated to this diff.



+1

Ingo,
A basic rule of thumb when building ports: raise your /etc/login.conf
limits...especially datasize-cur needs to be 2G and datasize-max needs to
be 3G. The reason being there are some ports where the linker blows up to
2G or slightly over. The worst offenders are usually the www/webkit or
chrome or firefox. Though py-py also takes a lot of memory.
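
The login.conf datasize settings end up as the process's RLIMIT_DATA resource limit, so the value a shell actually received can be checked from a program with getrlimit(2). A minimal sketch; datasize_at_least is a hypothetical helper name.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Returns 1 if the soft data-size limit is at least 'want' bytes
 * (or unlimited), 0 if it is lower, -1 if getrlimit fails.
 */
static int
datasize_at_least(rlim_t want)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_DATA, &rl) == -1)
		return -1;
	if (rl.rlim_cur == RLIM_INFINITY)
		return 1;
	return rl.rlim_cur >= want;
}
```

For the 2G figure above, it would be called as datasize_at_least((rlim_t)2 * 1024 * 1024 * 1024) from the shell that starts the ports build.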

There is also another well-known bug in the I/O path which espie@ referred
to a few months ago. But it is as yet undetected? It rears its ugly head
when your machine does a lot of I/O. Try running cvsync, building ports,
run a find/grep over ports tree, and try to browse with firefox all at the
same time. The system feels as if it goes into a hang. But give it a few
seconds and it comes back normally. Is this what is happening with you?


Re: lynx: disable old protocols

2014-07-16 Thread Amit Kulkarni
On Wed, Jul 16, 2014 at 4:00 PM, Shawn K. Quinn skqu...@rushpost.com
wrote:

 On Wed, 2014-07-16 at 13:56 -0500, patric conant wrote:
  I'd also like to point out that Shawn has broken the social contract
  here, it's well known that it's generally considered rude to direct
  developers, in this forum.

 Every single free or open-source software project I have ever used has
 been shaped by user feedback. Most take it seriously when users say they
 still use functionality that's being slated for removal. So Patric, you
 can take this social contract of yours and shove it up your ass. I
 don't recognize it as anything but toilet paper.


And the ports devs did listen ***seriously***. bcallah@ provided an initial
port and sthen@ gave some feedback. It might make it into the ports
tree. Are you not subscribed to ports@? Lynx is probably just a pkg_add
away. Or if that effort is abandoned, you can whip up your own port based
on bcallah@'s initial port.

This project is also shaped by user feedback. Otherwise, those two wouldn't
have bothered wasting their time on lynx.


remove obsolete variables NKMEMPAGES_MIN/ NKMEMPAGES_MAX

2013-07-07 Thread Amit Kulkarni


Index: regress/usr.bin/diff/t8.2
===
RCS file: /cvs/src/regress/usr.bin/diff/t8.2,v
retrieving revision 1.1
diff -u -p -r1.1 t8.2
--- regress/usr.bin/diff/t8.2   17 Jul 2003 21:04:04 -  1.1
+++ regress/usr.bin/diff/t8.2   7 Jul 2013 21:54:34 -
@@ -58,18 +58,6 @@ struct vm_map *kmem_map = NULL;
 #endif
 intnkmempages = NKMEMPAGES;
 
-/*
- * Defaults for lower- and upper-bounds for the kmem_map page count.
- * Can be overridden by kernel config options.
- */
-#ifndef NKMEMPAGES_MIN
-#define NKMEMPAGES_MIN  NKMEMPAGES_MIN_DEFAULT
-#endif
-
-#ifndef NKMEMPAGES_MAX
-#define NKMEMPAGES_MAX  NKMEMPAGES_MAX_DEFAULT
-#endif
-
 struct kmembuckets bucket[MINBUCKET + 16];
 struct kmemstats kmemstats[M_LAST];
 struct kmemusage *kmemusage;
@@ -434,8 +422,6 @@ free(addr, type)
 void
 kmeminit_nkmempages()
 {
-   int npages;
-
if (nkmempages != 0) {
/*
 * It's already been set (by us being here before, or
@@ -446,22 +432,13 @@ kmeminit_nkmempages()
 
/*
 * We use the following (simple) formula:
-*
 *  - Starting point is physical memory / 4.
-*
-*  - Clamp it down to NKMEMPAGES_MAX.
-*
-*  - Round it up to NKMEMPAGES_MIN.
+*  - If that is excessive, limit it to NKMEMPAGES_MAX_DEFAULT.
 */
-   npages = physmem / 4;
-
-   if (npages > NKMEMPAGES_MAX)
-   npages = NKMEMPAGES_MAX;
-
-   if (npages < NKMEMPAGES_MIN)
-   npages = NKMEMPAGES_MIN;
+   nkmempages = physmem / 4;
 
-   nkmempages = npages;
+   if (nkmempages > NKMEMPAGES_MAX_DEFAULT)
+   nkmempages = NKMEMPAGES_MAX_DEFAULT;
 }
 
 /*
Index: sys/kern/kern_malloc.c
===
RCS file: /cvs/src/sys/kern/kern_malloc.c,v
retrieving revision 1.102
diff -u -p -r1.102 kern_malloc.c
--- sys/kern/kern_malloc.c  4 Jul 2013 17:35:52 -   1.102
+++ sys/kern/kern_malloc.c  7 Jul 2013 21:54:37 -
@@ -85,8 +85,8 @@ struct vm_map *kmem_map = NULL;
 #endif
 
 /*
- * Default number of pages in kmem_map.  We attempt to calculate this
- * at run-time, but allow it to be either patched or set in the kernel
+ * Default number of pages in kmem_map.  We calculate this at
+ * compile-time, but also allow it to be either patched or set in the kernel
  * config file.
  */
 #ifndef NKMEMPAGES
@@ -94,20 +94,6 @@ struct vm_map *kmem_map = NULL;
 #endif
 u_int  nkmempages = NKMEMPAGES;
 
-/*
- * Defaults for lower- and upper-bounds for the kmem_map page count.
- * Can be overridden by kernel config options.
- */
-#ifndef NKMEMPAGES_MIN
-#define NKMEMPAGES_MIN  0
-#endif
-u_int  nkmempages_min = 0;
-
-#ifndef NKMEMPAGES_MAX
-#define NKMEMPAGES_MAX  NKMEMPAGES_MAX_DEFAULT
-#endif
-u_int  nkmempages_max = 0;
-
 struct kmembuckets bucket[MINBUCKET + 16];
 #ifdef KMEMSTATS
 struct kmemstats kmemstats[M_LAST];
@@ -475,8 +461,6 @@ free(void *addr, int type)
 void
 kmeminit_nkmempages(void)
 {
-   u_int npages;
-
if (nkmempages != 0) {
/*
 * It's already been set (by us being here before, or
@@ -490,30 +474,14 @@ kmeminit_nkmempages(void)
 * the page size may not be known (on sparc GENERIC kernels, for
 * example). But we still want the MD code to be able to provide
 * better values.
-*/
-   if (nkmempages_min == 0)
-   nkmempages_min = NKMEMPAGES_MIN;
-   if (nkmempages_max == 0)
-   nkmempages_max = NKMEMPAGES_MAX;
-
-   /*
 * We use the following (simple) formula:
-*
 *  - Starting point is physical memory / 4.
-*
-*  - Clamp it down to nkmempages_max.
-*
-*  - Round it up to nkmempages_min.
+*  - If that is excessive, limit it to NKMEMPAGES_MAX_DEFAULT.
 */
-   npages = physmem / 4;
-
-   if (npages > nkmempages_max)
-   npages = nkmempages_max;
-
-   if (npages < nkmempages_min)
-   npages = nkmempages_min;
+   nkmempages = physmem / 4;
 
-   nkmempages = npages;
+   if (nkmempages > NKMEMPAGES_MAX_DEFAULT)
+   nkmempages = NKMEMPAGES_MAX_DEFAULT;
 }
 
 /*
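
After this diff, the calculation in kmeminit_nkmempages() boils down to a single upper clamp. A standalone sketch follows, with the limit passed as a parameter because the value of NKMEMPAGES_MAX_DEFAULT is machine-dependent and not shown in the diff; sketch_nkmempages is a made-up name.

```c
/* A quarter of physical memory (in pages), capped at max_default. */
static unsigned int
sketch_nkmempages(unsigned int physmem, unsigned int max_default)
{
	unsigned int n = physmem / 4;

	if (n > max_default)
		n = max_default;
	return n;
}
```

Note that the lower bound is gone entirely: the old code rounded up to NKMEMPAGES_MIN, which defaulted to 0 in kern_malloc.c and so never had an effect there.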



Re: man pages: wireless frequency nit 2GHz vs 2.4GHz

2013-02-14 Thread Amit Kulkarni
On Thu, Feb 14, 2013 at 6:33 AM, Stuart Henderson s...@spacehopper.org wrote:

 Amit Kulkarni amitkulz at gmail.com writes:

 
  I was reading the manpages of athn/iwn for purchasing a suitable
 wireless card
 and found repeated
  occurrences of 2GHz, when in fact it should be 2.4GHz. That is the
 standard
 frequency when purchasing a
  wireless a/b/g/n card. The code is filled with 2GHz references but just
 changed to man pages in section 4.

 I don't think there is anything 'wrong' in saying operates in the 2GHz
 spectrum. Saying 2.4GHz in documentation seems like it might be a
 good idea as it's more common usage, but it seems wrong to say 2.4GHz
 spectrum, it seems like 2.4GHz band would make more sense if doing
 this. (But then mixing 'band' and 'spectrum' would also be weird so the
 references to 5GHz would need changing to 'bands' (not 'band' there as
 there are 3 non-continuous ranges). Is it worth it? I don't know.


 you guys beat me with a stick if i push more than once, so i will refrain
from commenting on it.

a section of athn.4 manpage is fluff, information is repeated twice. once
in words, other times in a table summarizing the list of working/supported
chips. the same manpage was copied over for iwn, a table of supported chips
would be nice in there too. if there's any interest i will sit down and
submit a manpage diff.


Re: man pages: wireless frequency nit 2GHz vs 2.4GHz

2013-02-14 Thread Amit Kulkarni
  a section of athn.4 manpage is fluff, information is repeated twice. once
  in words, other times in a table summarizing the list of working/supported
  chips. the same manpage was copied over for iwn, a table of supported chips
  would be nice in there too. if there's any interest i will sit down and
  submit a manpage diff.
 
 all diffs welcome.
 jmc
 

i just tried to duplicate the words into the table. any suggestions?

Index: athn.4
===
RCS file: /cvs/src/share/man/man4/athn.4,v
retrieving revision 1.22
diff -u -p -r1.22 athn.4
--- athn.4  14 Feb 2013 07:40:42 -  1.22
+++ athn.4  14 Feb 2013 22:14:17 -
@@ -100,23 +100,26 @@ The following table summarizes the suppo
 .It AR5008-3NG (AR5416+AR2133) Ta 2GHz Ta 3x3:2 Ta PCI/CardBus
 .It AR5008-2NX (AR5416+AR5122) Ta 2GHz/5GHz Ta 2x2:2 Ta PCI/CardBus
 .It AR5008-3NX (AR5416+AR5133) Ta 2GHz/5GHz Ta 3x3:2 Ta PCI/CardBus
-.It AR5008E-2NG (AR5418+AR2122) Ta 2GHz Ta 2x2:2 Ta PCIe
-.It AR5008E-3NG (AR5418+AR2133) Ta 2GHz Ta 3x3:2 Ta PCIe
-.It AR5008E-2NX (AR5418+AR5122) Ta 2GHz/5GHz Ta 2x2:2 Ta PCIe
-.It AR5008E-3NX (AR5418+AR5133) Ta 2GHz/5GHz Ta 3x3:2 Ta PCIe
-.It AR9001-2NG (AR9160+AR9103) Ta 2GHz Ta 2x2:2 Ta PCI
-.It AR9001-3NG (AR9160+AR9103) Ta 2GHz Ta 3x3:2 Ta PCI
-.It AR9001-3NX2 (AR9160+AR9106) Ta 2GHz/5GHz Ta 3x3:2 Ta PCI
-.It AR9220 Ta 2GHz/5GHz Ta 2x2:2 Ta PCI
-.It AR9223 Ta 2GHz Ta 2x2:2 Ta PCI
-.It AR9280 Ta 2GHz/5GHz Ta 2x2:2 Ta PCIe
+.It AR5008E-2NG (AR5418+AR2122) Ta 2GHz Ta 2x2:2 Ta Mini PCIe
+.It AR5008E-3NG (AR5418+AR2133) Ta 2GHz Ta 3x3:2 Ta Mini PCIe
+.It AR5008E-2NX (AR5418+AR5122) Ta 2GHz/5GHz Ta 2x2:2 Ta Mini PCIe
+.It AR5008E-3NX (AR5418+AR5133) Ta 2GHz/5GHz Ta 3x3:2 Ta Mini PCIe
+.It AR9001-2NG (AR9160+AR9103) Ta 2GHz Ta 2x2:2 Ta Mini PCI
+.It AR9001-3NG (AR9160+AR9103) Ta 2GHz Ta 3x3:2 Ta Mini PCI
+.It AR9001-3NX2 (AR9160+AR9106) Ta 2GHz/5GHz Ta 3x3:2 Ta Mini PCI
+.It AR9220 Ta 2GHz/5GHz Ta 2x2:2 Ta PCI/Mini PCI
+.It AR9223 Ta 2GHz Ta 2x2:2 Ta PCI/Mini PCI
+.It AR9280 (XB92) Ta 2GHz/5GHz Ta 2x2:2 Ta Mini PCIe
+.It AR9280 (HB92) Ta 2GHz/5GHz Ta 2x2:2 Ta half Mini PCIe
 .It AR9280+AR7010 Ta 2GHz/5GHz Ta 2x2:2 Ta USB 2.0
-.It AR9281 Ta 2GHz Ta 1x2:2 Ta PCIe
-.It AR9285 Ta 2GHz Ta 1x1:1 Ta PCIe
+.It AR9281 (XB91) Ta 2GHz Ta 1x2:2 Ta Mini PCIe
+.It AR9281 (HB91) Ta 2GHz Ta 1x2:2 Ta half Mini PCIe
+.It AR9285 Ta 2GHz Ta 1x1:1 Ta half Mini PCIe
+.It AR9285+AR3011 Ta 2GHz Ta 1x1:1 Ta half Mini PCIe
 .It AR9271 Ta 2GHz Ta 1x1:1 Ta USB 2.0
-.It AR2427 Ta 2GHz Ta 1x1:1 Ta PCIe
-.It AR9227 Ta 2GHz Ta 2x2:2 Ta PCI
-.It AR9287 Ta 2GHz Ta 2x2:2 Ta PCIe
+.It AR2427 Ta 2GHz Ta 1x1:1 Ta Mini PCIe
+.It AR9227 Ta 2GHz Ta 2x2:2 Ta PCI/Mini PCI
+.It AR9287 Ta 2GHz Ta 2x2:2 Ta half Mini PCIe
 .It AR9287+AR7010 Ta 2GHz Ta 2x2:2 Ta USB 2.0
 .El
 .Pp



man pages: wireless frequency nit 2GHz vs 2.4GHz

2013-02-13 Thread Amit Kulkarni
I was reading the manpages of athn/iwn while shopping for a suitable wireless card 
and found repeated occurrences of 2GHz, when in fact it should be 2.4GHz. That 
is the frequency quoted when purchasing a wireless a/b/g/n card. The code is 
filled with 2GHz references too, but I only changed the man pages in section 4.

Index: man4/athn.4
===
RCS file: /cvs/src/share/man/man4/athn.4,v
retrieving revision 1.21
diff -u -p -r1.21 athn.4
--- man4/athn.4 17 Sep 2012 11:04:24 -  1.21
+++ man4/athn.4 13 Feb 2013 18:51:46 -
@@ -36,21 +36,21 @@ It consists of two chips, a MAC/Baseband
 The MAC/Baseband Processor can be an AR5416 (PCI and CardBus form factors)
 or an AR5418 (PCIe Mini Card form factor).
 The radio can be an AR2122, AR2133, AR5122 or an AR5133 chip.
-The AR2122 chip operates in the 2GHz spectrum and supports up to 2
+The AR2122 chip operates in the 2.4GHz spectrum and supports up to 2
 transmit paths and 2 receiver paths (2T2R).
-The AR2133 chip operates in the 2GHz spectrum and supports up to 3
+The AR2133 chip operates in the 2.4GHz spectrum and supports up to 3
 transmit paths and 3 receiver paths (3T3R).
-The AR5122 chip operates in the 2GHz and 5GHz spectra and supports
+The AR5122 chip operates in the 2.4GHz and 5GHz spectra and supports
 up to 2 transmit paths and 2 receiver paths (2T2R).
-The AR5133 chip operates in the 2GHz and 5GHz spectra and supports
+The AR5133 chip operates in the 2.4GHz and 5GHz spectra and supports
 up to 3 transmit paths and 3 receiver paths (3T3R).
 .Pp
 The AR9001 (codenamed Sowl) is a Mini-PCI 802.11n solution.
 It consists of two chips, an AR9160 MAC/Baseband Processor and an
 AR9103 or AR9106 Radio-on-a-Chip.
-The AR9103 chip operates in the 2GHz spectrum and supports up to 3
+The AR9103 chip operates in the 2.4GHz spectrum and supports up to 3
 transmit paths and 3 receiver paths (3T3R).
-The AR9106 chip operates in the 2GHz and 5GHz spectra and supports
+The AR9106 chip operates in the 2.4GHz and 5GHz spectra and supports
 up to 3 transmit paths and 3 receiver paths (3T3R).
 .Pp
 The AR9220, AR9223 and AR9280 (codenamed Merlin) are the
@@ -59,65 +59,65 @@ Atheros single-chip 802.11n solutions.
 The AR9220 and AR9223 exist in PCI and Mini-PCI form factors.
 The AR9280 exists in PCIe Mini Card (XB92), half Mini Card (HB92)
 and USB 2.0 (AR9280+AR7010) form factors.
-The AR9220 and AR9280 operate in the 2GHz and 5GHz spectra and
+The AR9220 and AR9280 operate in the 2.4GHz and 5GHz spectra and
 support 2 transmit paths and 2 receiver paths (2T2R).
-The AR9223 operates in the 2GHz spectrum and supports 2
+The AR9223 operates in the 2.4GHz spectrum and supports 2
 transmit paths and 2 receiver paths (2T2R).
 .Pp
 The AR9281 is a single-chip PCIe 802.11n solution.
 It exists in PCIe Mini Card (XB91) and half Mini Card (HB91) form
 factors.
-It operates in the 2GHz spectrum and supports 1 transmit path and
+It operates in the 2.4GHz spectrum and supports 1 transmit path and
 2 receiver paths (1T2R).
 .Pp
 The AR9285 (codenamed Kite) is a single-chip PCIe 802.11n solution that
 targets the value PC market.
 It exists in PCIe half Mini Card (HB95) form factor only.
-It operates in the 2GHz spectrum and supports a single stream (1T1R).
+It operates in the 2.4GHz spectrum and supports a single stream (1T1R).
 It can be combined with the AR3011 chip to form a combo WiFi/Bluetooth
 device (WB195).
 .Pp
 The AR9271 is a single-chip USB 2.0 802.11n solution.
-It operates in the 2GHz spectrum and supports a single stream (1T1R).
+It operates in the 2.4GHz spectrum and supports a single stream (1T1R).
 .Pp
 The AR2427 is a single-chip PCIe 802.11b/g solution similar to the other
 AR9280 solutions but with 802.11n capabilities removed.
 It exists in PCIe Mini Card form factor only.
-It operates in the 2GHz spectrum.
+It operates in the 2.4GHz spectrum.
 .Pp
 The AR9227 and AR9287 are single-chip 802.11n solutions that
 target mid-tier PCs.
 The AR9227 exists in PCI and Mini-PCI form factors.
 The AR9287 exists in PCIe half Mini Card (HB97)
 and USB 2.0 (AR9287+AR7010) form factors.
-They operate in the 2GHz spectrum and support 2 transmit paths and 2
+They operate in the 2.4GHz spectrum and support 2 transmit paths and 2
 receiver paths (2T2R).
 .Pp
 The following table summarizes the supported chips and their capabilities.
-.Bl -column AR9001-3NX2 (AR9160+AR9106) 2GHz/5GHz 3x3:3 PCI/CardBus -offset 6n
+.Bl -column AR9001-3NX2 (AR9160+AR9106) 2.4GHz/5GHz 3x3:3 PCI/CardBus -offset 6n
 .It Em Chipset Ta Em Spectrum Ta Em TxR:S Ta Em Bus
-.It AR5008-2NG (AR5416+AR2122) Ta 2GHz Ta 2x2:2 Ta PCI/CardBus
-.It AR5008-3NG (AR5416+AR2133) Ta 2GHz Ta 3x3:2 Ta PCI/CardBus
-.It AR5008-2NX (AR5416+AR5122) Ta 2GHz/5GHz Ta 2x2:2 Ta PCI/CardBus
-.It AR5008-3NX (AR5416+AR5133) Ta 2GHz/5GHz Ta 3x3:2 Ta PCI/CardBus
-.It AR5008E-2NG (AR5418+AR2122) Ta 2GHz Ta 2x2:2 Ta PCIe
-.It AR5008E-3NG (AR5418+AR2133) Ta 2GHz Ta 3x3:2 Ta PCIe

missing include in sys/proc.h?

2012-11-20 Thread Amit Kulkarni
While porting boost-1.52.0 (latest boost) and checking affected ports, I came 
across this.

c++ -o src/mapped_memory_cache.os -c -DHAVE_JPEG -pthread -ansi -Wall -pthread 
-ftemplate-depth-300 -DOPENBSD -DBOOST_SPIRIT_THREADSAFE -DMAPNIK_THREADSAFE 
-O3 -finline-functions -Wno-inline -DNDEBUG -DHAVE_CAIRO -fPIC 
-I/usr/local/include/cairomm-1.0 -I/usr/local/lib/cairomm-1.0/include 
-I/usr/local/include/cairo -I/usr/local/include/glib-2.0 
-I/usr/local/lib/glib-2.0/include -I/usr/X11R6/include/pixman-1 
-I/usr/include/dev/pci/drm -I/usr/local/include/sigc++-2.0 
-I/usr/local/lib/sigc++-2.0/include -I. -Iinclude -I/usr/local/include/agg2 
-I/usr/local/include/postgresql -I/usr/local/include/libxml2 
-I/usr/local/include -I/usr/local/include/libpng -I/usr/include 
-I/usr/X11R6/include/freetype2 -I/usr/X11R6/include src/mapped_memory_cache.cpp
In file included from /usr/include/sys/sysctl.h:42,
                 from /usr/local/include/boost/interprocess/detail/workaround.hpp:113,
                 from /usr/local/include/boost/interprocess/mapped_region.hpp:15,
                 from include/mapnik/mapped_memory_cache.hpp:34,
                 from src/mapped_memory_cache.cpp:26:
/usr/include/sys/proc.h:65: error: 'MAXLOGNAME' was not declared in this scope
/usr/include/sys/proc.h:321: error: 'MAXCOMLEN' was not declared in this scope
/usr/local/include/boost/system/error_code.hpp:214: warning: 'boost::system::posix_category' defined but not used
/usr/local/include/boost/system/error_code.hpp:215: warning: 'boost::system::errno_ecat' defined but not used
/usr/local/include/boost/system/error_code.hpp:216: warning: 'boost::system::native_ecat' defined but not used
scons: *** [src/mapped_memory_cache.os] Error 1
scons: building terminated because of errors.
*** Error 2 in . (/home/amit/obsd/ports/devel/scons/scons.port.mk:24 
'do-build': @/usr/bin/env -i CC=cc PYTHONUSERBASE=/home/amit/obsd/ports...)
*** Error 1 in . (/home/amit/obsd/ports/infrastructure/mk/bsd.port.mk:2556 
'/home/amit/obsd/ports/pobj/mapnik-2.0.1/.build_done')
*** Error 1 in /home/amit/obsd/ports/graphics/mapnik 
(/home/amit/obsd/ports/infrastructure/mk/bsd.port.mk:2280 'all')


I can patch the graphics/mapnik port to include sys/param.h, but sys/proc.h 
doesn't include sys/param.h itself. Is this a bug?

a yes/no confirmation is appreciated.



Re: upstream vendors and why they can be really harmful

2012-11-06 Thread Amit Kulkarni
 Basically, we have a pattern, mostly observed with kde (and a bit with
 gnome) which is really harmful for us.

 They occupy a few people in our team FULLTIME with respect to gnome, they're
 the reason we still DON'T have a full kde4 in our tree (hopefully to be
 addressed shortly), and they're the reason why sometimes we do drop old
 stuff (like killing gtk+1, and people really wanting to kill some gtk2/qt3
 stuff).

 It's also quickly turning Posix and Unix into a travesty: either you have
 the linux goodies, or you don't. And if you don't, you can forget anything
 modern...


 Not to disparage the hard work by Antoine and others on Gnome and KDE, but if 
 upstream are going to entwine their code with non-standard OSs, then why 
 bother with them? If everyone but the mainstream Linux distros dropped their 
 projects, it seems a more likely way of getting through to the upstream 
 developers than joining their project or sending them emails.

eventually they get the message but sometimes it's too late.


 I use Joe's Window Manager, it compiles in less than a minute straight from 
 the sources with no patching or tweaking. I don't have semi-transparent 
 windowbars and I had to make a couple of tweaks so I could hear a beep when 
 I get an IM, apart from that, what can a modern window manager do that is 
 worth the some porter's pain (and extra 10-20% cpu consumption to run) anyway?

 Stuff like X is a different matter, if upstream must be battled, I would say 
 send the troops to defend what is hard to do without, not what is easy to do 
 without.


I care about some things in kde4 like kdevelop/kate, educational apps
for kids (very few mainstream distros bother with children), and for
qt4/5 the ability to write a single codebase for mobile + desktop,
among other things. There are some things in the big WMs which are
missing in JWM :-) Probably other guys have different motivations.



Re: ln -s example

2012-09-20 Thread Amit Kulkarni
  shouldn't this order be flipped?
 

 the example does what its description says. why do you think it should
 be reversed?

 because people are often confused by symlinks? I always tell the
 confused: the order is the same as cp(1): the first argument needs to
 exist, the second one is created.

 -Otto

This is very helpful. Usually in OpenBSD, /var has limited space, so
you create a symbolic link /var/www and have it point to /home/www,
where the actual data is stored and which has more space.

This particular example could be

Create a symbolic link named /var/www and point it to /home/www:

 # ln -s /home/www /var/www
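
The same argument order applies to the symlink(2) system call. Below is a minimal sketch, assuming the /home/www and /var/www layout from the example above; make_link() and link_target() are hypothetical helper names, not part of any library.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a symbolic link named `linkpath` that points to `target`.
 * Same order as ln -s and cp(1): the existing thing first, the new
 * name second. Returns 0 on success, -1 on error (errno is set).
 */
int
make_link(const char *target, const char *linkpath)
{
	return symlink(target, linkpath);
}

/*
 * Read back where `linkpath` points. Fills `buf` (NUL-terminated) and
 * returns 0 on success, -1 on error. The result is silently truncated
 * if `buf` is too small; `len` must be at least 1.
 */
int
link_target(const char *linkpath, char *buf, size_t len)
{
	ssize_t n = readlink(linkpath, buf, len - 1);

	if (n < 0)
		return -1;
	buf[n] = '\0';
	return 0;
}
```

Calling make_link("/home/www", "/var/www") corresponds to the ln -s command above: /var/www is the name that gets created.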



ln -s example

2012-09-19 Thread Amit Kulkarni
shouldn't this order be flipped?

Index: ln.1
===
RCS file: /cvs/src/bin/ln/ln.1,v
retrieving revision 1.29
diff -u -p -r1.29 ln.1
--- ln.1 2 Mar 2011 07:47:21 -   1.29
+++ ln.1 19 Sep 2012 23:27:04 -
@@ -130,7 +130,7 @@ Create a symbolic link named
 and point it to
 .Pa /var/www :
 .Pp
-.Dl # ln -s /var/www /home/www
+.Dl # ln -s /home/www /var/www
 .Pp
 Hard link
 .Pa /usr/local/bin/fooprog



Re: Hibernate support

2012-07-11 Thread Amit Kulkarni
 2. Hibernate writes to swap (at the end of your swap). If you have too small
 a swap, it won't work, or if there are swap pages in use at the end of your
 swap that overlap with what we want. You need at least size of mem + 64MB
 of swap at the end of swap, free, at the time of hibernate.

mike/theo, first off big thanks for working on this!

pardon my newbie question but does this ^^^ above paragraph mean i
might need to create a new swap like this? assume a mem of 8 GB. (i
will be re-partitioning within a few months to get a bigger
/usr/local, /usr, and /var)

new swap = 8GB (for real swap) + 8GB + 64MB (for hibernate purpose)

i would conservatively (worst case scenario) opt for the option above.
please advise if it's right. the current swap on the openbsd machine is
the default swap chosen by the installer.
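
the worst-case arithmetic above, written out as a small sketch. it only encodes my own reading (ordinary swap, plus a memory-sized area, plus the 64MB margin); whether that much is really needed is exactly the question. worst_case_swap() is a made-up name.

```c
#include <stdint.h>

#define MB (1024ULL * 1024ULL)
#define GB (1024ULL * MB)

/*
 * Worst-case swap size under the assumption above: the swap you
 * would normally configure, plus room for a full memory image,
 * plus the 64MB margin mentioned for hibernate.
 */
uint64_t
worst_case_swap(uint64_t ram_bytes, uint64_t normal_swap_bytes)
{
	return normal_swap_bytes + ram_bytes + 64 * MB;
}
```

for 8GB of memory and 8GB of ordinary swap, worst_case_swap(8 * GB, 8 * GB) comes to 16GB + 64MB.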

windows (they do have one of the best hibernate around) creates a
separate pagefile (swap) and a hibernate file. any thoughts of having
a /var/hibernate or something along those lines?

thanks



Re: C Programming Language - K&R books to be given...

2012-07-02 Thread Amit Kulkarni
On Mon, Jul 2, 2012 at 4:47 AM, Amarendra Godbole
amarendra.godb...@gmail.com wrote:
 Hi misc@, tech@,

 If it is difficult to grab hold of a copy of K&R 2nd ed., please drop
 me a private note -- I have a bunch of copies (5) which I can send
 across your way as a gift. I'll probably ask you to cover the shipping
 (~$6 US). These are Indian reprints which cost a lot less here in
 India (~$2.5 US), than they do in the US or the EU. Thank you.

 -Amarendra


they are cheap in india for a specific reason, and they are expensive
in US/EU for another specific reason.

this is getting into import/export. be careful.

good luck



Re: Remove timezone support from the kernel

2012-04-24 Thread Amit Kulkarni
On Tue, Apr 24, 2012 at 7:35 AM, Stuart Henderson s...@spacehopper.org
wrote:
 On 2012/04/24 16:27, Vadim Zhukov wrote:
 On 23 April 2012 at 21:37, Matthew Dempsky
 matt...@dempsky.org wrote:
  There's no reason for the kernel to track the system's timezone
  anymore.  This is handled in userspace by the TZ environment variable,
  and POSIX doesn't even define what happens if you pass a non-NULL
  pointer as the 'struct timezone *' argument to gettimeofday() (and
  settimeofday() has never been in POSIX).
 
  The diff below:
   - eliminates tz
   - adds a compile-time check to detect configs with non-0 timezone
   - changes settimeofday() to return EINVAL when given a non-0 timezone
   - eliminates the userconf code for changing/printing the timezone
   - removes clock and msdosfs code that looks at the kernel timezone
 
  After this, we'll be able to move gettimeofday() and settimeofday()
  into libc as user-space wrappers around clock_gettime() and
  clock_settime(), respectively.
 
  Any objections?

 This will somewhat break dual-booting machines with Windblows as
 second OS. :( But I'm not a developer and do not have any vote, of
 course. :)

 It seems simpler to use NTP to fetch a correct time, than to build a custom
 kernel.

I hope to find a way to keep Windows time unmodified while OpenBSD adjusts,
i.e. what is currently documented in the FAQ:
http://www.openbsd.org/faq/faq8.html#TimeZone

The installer asks for the TZ and adjusts /etc/localtime, but until I
changed the option TIMEZONE=value in the kernel the clock was off.
thanks



Re: Remove timezone support from the kernel

2012-04-24 Thread Amit Kulkarni
On Tue, 24 Apr 2012 15:06:42 -0700
Matthew Dempsky matt...@dempsky.org wrote:

 On Tue, Apr 24, 2012 at 04:27:00PM +0400, Vadim Zhukov wrote:
  This will somewhat break dual-booting machines with Windblows as
  second OS. :(
 
 Okay, here's an alternative diff that only affects gettimeofday() and
 settimeofday(). Users can still set the kernel timezone through
 config(8) or boot_config(8), and it will still be used to compensate
 for a non-UTC clock, but that's all the kernel timezone will be used
 for. As far as userland will see, the kernel will appear hard
 configured for UTC: gettimeofday() will always return a UTC timezone
 and settimeofday() will return EINVAL if you try to set a non-UTC
 timezone.
 
 In base, this should only affect users of date(1)'s -d and -t options,
 but I plan to remove those anyway if this diff goes in.
 
 I'm very interested in hearing from Windows dual-booters who use local
 time RTCs whether this has any negative consequences for them.

No problems on amd64, with ntpd_flags= -s in /etc/rc.conf.local

thanks

 Index: kern_time.c
 ===
 RCS file: /home/mdempsky/anoncvs/cvs/src/sys/kern/kern_time.c,v
 retrieving revision 1.74
 diff -u -p -r1.74 kern_time.c
 --- kern_time.c   23 Mar 2012 15:51:26 -  1.74
 +++ kern_time.c   24 Apr 2012 16:17:44 -
 @@ -391,8 +458,10 @@ sys_gettimeofday(struct proc *p, void *v
   }
  #endif
   }
 -	if (tzp)
 -		error = copyout(&tz, tzp, sizeof (tz));
 +	if (tzp) {
 +		const struct timezone tz0 = { 0 };
 +		error = copyout(&tz0, tzp, sizeof(tz0));
 +	}
  	return (error);
  }
  
 @@ -415,20 +484,22 @@ sys_settimeofday(struct proc *p, void *v
  
  	if ((error = suser(p, 0)))
  		return (error);
 -	/* Verify all parameters before changing time. */
 -	if (tv && (error = copyin(tv, &atv, sizeof(atv))))
 -		return (error);
 -	if (tzp && (error = copyin(tzp, &atz, sizeof(atz))))
 -		return (error);
 +	if (tzp) {
 +		if ((error = copyin(tzp, &atz, sizeof(atz))) != 0)
 +			return (error);
 +		if (atz.tz_minuteswest != 0 || atz.tz_dsttime != 0)
 +			return (EINVAL);
 +	}
  	if (tv) {
  		struct timespec ts;
  
 +		if ((error = copyin(tv, &atv, sizeof(atv))) != 0)
 +			return (error);
 +
  		TIMEVAL_TO_TIMESPEC(&atv, &ts);
  		if ((error = settime(&ts)) != 0)
  			return (error);
  	}
 -	if (tzp)
 -		tz = atz;
   return (0);
  }
 


-- 
Amit Kulkarni amitk...@gmail.com



Re: Making time_t deal with the coming epoch

2012-04-01 Thread Amit Kulkarni
err. this is an april fool, right, guys?

:-)

On Sun, Apr 1, 2012 at 1:13 PM, Nicholas Marriott
nicholas.marri...@gmail.com wrote:
 Can I make it wake up again automatically three days later?


 On Sun, Apr 01, 2012 at 06:35:08PM +0200, Benny Lofgren wrote:
 On 2012-04-01 09.05, Theo de Raadt wrote:
  The epoch isn't far that away and we need to prepare OpenBSD for it.
 
  I had a little free time, so I wrote a diff to simulate the behaviour
  so that we can test how parts of OpenBSD cope with it.

 May I suggest a more versatile and flexible approach to the problem?

 Here's the man page diff:

 --- /dev/null Sun Apr  1 18:25:35 2012
 +++ lib/libc/sys/getendofdays.2   Sun Apr  1 18:24:23 2012
 @@ -0,0 +1,139 @@
 +.\"	$OpenBSD: getendofdays.2,v 1.00 2012/04/01 21:00:59 hcamping Exp $
 +.\"
 +.\" Copyright (c) 1980, 1991, 1993, 13 b'ak'tun
 +.\"	The Regents of the University of California.  All rights reserved.
 +.\"
 +.\" Redistribution and use in source and binary forms, with or without
 +.\" modification, are permitted provided that the following conditions
 +.\" are met:
 +.\" 1. Redistributions of source code must retain the above copyright
 +.\"    notice, this list of conditions and the following disclaimer.
 +.\" 2. Redistributions in binary form must reproduce the above copyright
 +.\"    notice, this list of conditions and the following disclaimer in the
 +.\"    documentation and/or other materials provided with the distribution.
 +.\" 3. Neither the name of the University nor the names of its contributors
 +.\"    may be used to endorse or promote products derived from this software
 +.\"    without specific prior written permission.
 +.\"
 +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
 +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 +.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
 +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 +.\" SUCH DAMAGE.
 +.\"
 +.\"     @(#)getendofdays.2	8.2 (Berkeley) 4/1/2012
 +.\"
 +.Dd $Mdocdate: Apr 1 2012 $
 +.Dt getendofdays 2
 +.Os
 +.Sh NAME
 +.Nm getendofdays ,
 +.Nm setendofdays
 +.Nd get/set timestamp of current end of times
 +.Sh SYNOPSIS
 +.Fd #include <sys/time.h>
 +.Ft int
 +.Fn getendofdays "struct timeval *raptp" "struct timeval *repentp"
 +.Ft int
 +.Fn setendofdays "const struct timeval *raptp" "const struct timeval *repentp"
 +.Sh DESCRIPTION
 +.Bf -symbolic
 +Note: timezone is no longer used; this information is kept outside
 +the kernel. In any event, it won't be needed after.
 +.Ef
 +.Pp
 +The system's notion of when the end of days, also known as EOE (End of Epoch),
 +occurs, as well as at what time the system will automatically be shut down in
 +order to preserve itself is obtained with the
 +.Fn getendofdays
 +call, and set with the
 +.Fn setendofdays
 +call.
 +The time is expressed in seconds and microseconds
 +since midnight (0 hour), January 1, 1970.
 +The resolution of the system clock is hardware dependent, and the time
 +may be updated continuously or in
 +.Dq ticks .
 +If
 +.Fa raptp
 +is
 +.Dv NULL ,
 +the associated time
 +information will not be returned, or the default EOE of Jan 19, 2038
 +will be set, respectively.
 +.Pp
 +If
 +.Fa repentp
 +is
 +.Dv NULL ,
 +the associated time
 +information will not be returned, and there will be no repentive system
 +shutdown performed (not recommended).
 +.Pp
 +The structure pointed to by
 +.Fa raptp
 +and
 +.Fa repentp
 +is defined in
 +.Aq Pa sys/time.h
 +as:
 +.Bd -literal
 +struct timeval {
 +	long	tv_sec;		/* seconds since Jan. 1, 1970 */
 +	long	tv_usec;	/* and microseconds */
 +};
 +
 +.Ed
 +.Pp
 +Only the superuser may set the end of days.
 +If the system securelevel is greater than 1 (see
 +.Xr init 8 ) ,
 +the end of days may only be advanced.
 +This limitation is imposed to prevent a malicious superuser
 +from setting arbitrary ends of days.
 +.Sh RETURN VALUES
 +A 0 return value indicates that the call succeeded and that the end is nigh.
 +A \-1 return value indicates an error occurred, and in this
 +case an error code is stored into the global variable
 +.Va errno .
 +.Sh ERRORS
 +The following error codes may be set in
 +.Va errno :
 +.Bl -tag -width [EFAULT]
 +.It Bq Er EFAULT
 +An argument address referenced invalid memory.
 +.It Bq Er EPERM
 +A user other than the superuser attempted to set the end of days.
 +.It Bq Er EBUSY
 +The end of days is already in progress.
 +.It Bq Er ETIMEDOUT

Re: Allow clang++ to work on OpenBSD

2011-12-12 Thread Amit Kulkarni
On Mon, Dec 12, 2011 at 10:06 AM, Pascal Stumpf pascal.stu...@cubes.de
wrote:
 On Mon, 12 Dec 2011 16:55:04 +0100 (CET), Mark Kettenis wrote:
  Date: Mon, 12 Dec 2011 16:51:48 +0100
  From: Pascal Stumpf  pascal.stu...@cubes.de
 
  On Mon, 12 Dec 2011 16:26:42 +0100, Marc Espie wrote:
   On Mon, Dec 12, 2011 at 04:00:44PM +0100, Pascal Stumpf wrote:
On Mon, 12 Dec 2011 14:41:45 +0100 (CET), Mark Kettenis wrote:

 The s/restrict/__restrict/g in cstdio shouldn't be necessary.
   
Apparently, clang++ interprets "restrict" as parameter name, i.e.:
   
attr.cc:1:50: error: redefinition of parameter 'restrict'
extern "C" int foo(const char * restrict, char * restrict, ...)
                                                 ^
attr.cc:1:33: note: previous declaration is here
extern "C" int foo(const char * restrict, char * restrict, ...)
                                ^
   
This might indeed be a bug, but I'd have to read the C++ standard to be
sure.  In pure C, clang doesn't complain.
  
   I'm not that surprised. restrict is C99.  It's not part of C++98.
  
   Googling for restrict and C++ show various bug-reports explicitly stating
   that library headers should probably adapt.
  
   I don't have access to C++ 2011 yet, but from n3242, it seems that it does
   now refer to C99 instead of C89, so restrict is probably legit in C++2011.
  
   So it looks to me like clang in C++98 mode is totally right to not
   recognize restrict as a keyword!
 
  Yes, you're right.  And clang++ -std=c++0x does recognise restrict as a
  keyword.  cstdio should be adapted (and gcc 4.6 does indeed have
  __restrict over restrict).

 Still worth checking if only removing the XXX_CHECK defines and
 leaving the XXX_DYNAMIC defines helps.

 Yes, it does. :)  Thanks!


pascal,
thanks for the persistent follow-up. all, please get this in.

finally clang++ is usable, and can be used without any problems in ports!

thanks
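
for anyone following along, the __restrict spelling that the cstdio change relies on looks like this in a header-style declaration. this is only a sketch: copy_str() is a made-up function, and __restrict is a gcc/clang extension spelling rather than standard C or C++.

```c
#include <stddef.h>

/*
 * Copy at most n-1 bytes of src into dst and NUL-terminate.
 * __restrict promises the compiler that the two buffers do not
 * overlap. Unlike the C99 keyword `restrict`, this spelling is a
 * reserved identifier, so gcc and clang also accept it when the
 * header is pulled into C++98 code.
 * Returns the number of bytes copied, not counting the NUL.
 */
size_t
copy_str(char *__restrict dst, const char *__restrict src, size_t n)
{
	size_t i = 0;

	if (n == 0)
		return 0;
	while (i + 1 < n && src[i] != '\0') {
		dst[i] = src[i];
		i++;
	}
	dst[i] = '\0';
	return i;
}
```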



Re: tar -J for xz

2011-10-04 Thread Amit Kulkarni
 this diff adds a -J flag to tar that calls xz for compress/decompress.
 Requires you to install the xz package on your system.

 No way.

 Base never depends on external things.


http://git.tukaani.org/?p=xz.git;a=tree

Probably can't import xz into base as parts of xz are GPL v3.



Re: wscanf

2011-09-22 Thread Amit Kulkarni
  wscanf based on our scanf implementation. The delta from narrow
  to wide character support was obtained from FreeBSD and modified
  to fit our code.
 
  Does not include a libc bump yet!

 Here are corresponding libstdc++ changes, also without bumps.

 The gcc3 version will have to be revisited once we also get wcsftime.
 But for now it should work like this (the configure script doesn't define
 _GLIBCPP_USE_WCHAR_T unless it finds all wchar functions it wants).

 Not sure if the c_compatibility change is really needed.
 We didn't update it for wprintf (by accident?)

 As with wprintf, no actual changes for gcc2 -- we'll just bump the lib
 for safety.

 Can someone run this (together with the libc wscanf diff) through a build
 on gcc2 and gcc3 architectures?

 Fwiw, this went on a full amd64 bulk build without issues in the ports
 tree. Yay!


i remember that there were ports where wprintf was enabled but wscanf
was disabled during configure, were they now turned to enable both?

great news indeed!!!

thanks



Re: ksh history corruption

2011-09-01 Thread Amit Kulkarni
FWIW, I like this... makes it better for me to at least grab some stuff
and copy it over ***if*** it corrupts again.

thanks
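
To make the idea in the quoted message below concrete (keep the file open, append each line under a lock), here is a minimal standalone sketch. It is not Marco's code: append_line() and lock_retry() are made-up names, and unlike the diff this version only retries flock(2) on EINTR.

```c
#include <sys/file.h>
#include <errno.h>
#include <stdio.h>

/* Retry flock(2) while it is interrupted by a signal. */
static int
lock_retry(int fd, int op)
{
	while (flock(fd, op) != 0) {
		if (errno != EINTR)
			return -1;
	}
	return 0;
}

/*
 * Append one line to an already-open history stream while holding
 * an exclusive lock, so concurrent shells do not interleave writes.
 * Returns 0 on success, -1 on error.
 */
int
append_line(FILE *fp, const char *line)
{
	int rv = 0;

	if (lock_retry(fileno(fp), LOCK_EX) != 0)
		return -1;
	if (fseek(fp, 0L, SEEK_END) != 0 ||
	    fprintf(fp, "%s\n", line) < 0 || fflush(fp) != 0)
		rv = -1;
	if (lock_retry(fileno(fp), LOCK_UN) != 0)
		rv = -1;
	return rv;
}
```

Since the file is plain text and only ever appended to between rewrites, a damaged tail can still be inspected and copied by hand, which is the part I like.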

On Thu, Sep 1, 2011 at 3:22 PM, Marco Peereboom sl...@peereboom.us wrote:
 Alright this diff keeps the file open and appends lines to HISTFILE.  It
 only rewrites HISTFILE at 125% of HISTSIZE.  Does fancy locking and
 deals with signals too.  So unless someone finds some bugs I'll consider
 this version final.

 Yes, no on moving ksh history to text?

 Other comments?

 Index: alloc.c
 ===
 RCS file: /cvs/src/bin/ksh/alloc.c,v
 retrieving revision 1.8
 diff -u -p -r1.8 alloc.c
 --- alloc.c 21 Jul 2008 17:30:08 -  1.8
 +++ alloc.c 30 Aug 2011 18:05:47 -
 @@ -62,7 +62,7 @@ alloc(size_t size, Area *ap)
  {
struct link *l;

 -   l = malloc(sizeof(struct link) + size);
 +   l = calloc(1, sizeof(struct link) + size);
if (l == NULL)
internal_errorf(1, "unable to allocate memory");
l->next = ap->freelist;
 Index: history.c
 ===
 RCS file: /cvs/src/bin/ksh/history.c,v
 retrieving revision 1.39
 diff -u -p -r1.39 history.c
 --- history.c   19 May 2010 17:36:08 -  1.39
 +++ history.c   1 Sep 2011 20:14:44 -
 @@ -11,8 +11,7 @@
  * a)  the original in-memory history  mechanism
  * b)  a more complicated mechanism done by  p...@hillside.co.uk
  * that more closely follows the real ksh way of doing
 - * things. You need to have the mmap system call for this
 - * to work on your system
 + * things.
  */

  #include "sh.h"
 @@ -20,21 +19,11 @@

  #ifdef HISTORY
  # include <sys/file.h>
 -# include <sys/mman.h>

 -/*
 - * variables for handling the data file
 - */
 -static int histfd;
 -static int hsize;
 -
 -static int hist_count_lines(unsigned char *, int);
 -static int hist_shrink(unsigned char *, int);
 -static unsigned char *hist_skip_back(unsigned char *,int *,int);
 -static void histload(Source *, unsigned char *, int);
 -static void histinsert(Source *, int, unsigned char *);
 -static void writehistfile(int, char *);
 -static int sprinkle(int);
 +static void	writehistfile(void);
 +static FILE	*history_open(void);
 +static int	history_load(Source *);
 +static void	history_close(void);

  static int hist_execute(char *);
  static int hist_replace(char **, const char *, const char *, int);
 @@ -42,11 +31,14 @@ static char   **hist_get(const char *, i
  static char   **hist_get_oldest(void);
  static void	histbackup(void);

 +static FILE	*histfd;
  static char   **current;   /* current position in history[] */
  static char	*hname; /* current name of history file */
  static int hstarted;   /* set after hist_init() called */
 -static Source  *hist_source;
 +static Source  *hist_source;
 +static uint32_t	line_co;

 +static struct stat last_sb;

  int
  c_fc(char **wp)
 @@ -529,15 +521,10 @@ sethistfile(const char *name)
/* if the name is the same as the name we have */
if (hname && strcmp(hname, name) == 0)
return;
 -
/*
 * its a new name - possibly
 */
 -   if (histfd) {
 -   /* yes the file is open */
 -   (void) close(histfd);
 -   histfd = 0;
 -   hsize = 0;
 +   if (hname) {
afree(hname, APERM);
hname = NULL;
/* let's reset the history */
 @@ -545,6 +532,9 @@ sethistfile(const char *name)
hist_source->line = 0;
}

 +   if (histfd)
 +   history_close();
 +
hist_init(hist_source);
  }

 @@ -561,6 +551,27 @@ init_histvec(void)
}
  }

 +static void
 +history_lock(void)
 +{
 +   while (flock(fileno(histfd), LOCK_EX) != 0) {
 +   if (errno == EINTR || errno == EAGAIN)
 +   continue;
 +   else
 +   break;
 +   }
 +}
 +
 +static void
 +history_unlock(void)
 +{
 +   while (flock(fileno(histfd), LOCK_UN) != 0) {
 +   if (errno == EINTR || errno == EAGAIN)
 +   continue;
 +   else
 +   break;
 +   }
 +}

  /*
  * Routines added by Peter Collinson BSDI(Europe)/Hillside Systems to
 @@ -577,18 +588,29 @@ init_histvec(void)
  void
  histsave(int lno, const char *cmd, int dowrite)
  {
 -   char **hp;
 -   char *c, *cp;
 +   char**hp;
 +   char*c, *cp;
 +   struct stat sb;
 +
 +   if (dowrite && histfd) {
 +   history_lock();
 +   if (fstat(fileno(histfd), &sb) != -1) {
 +   if (timespeccmp(&sb.st_mtim, &last_sb.st_mtim, ==))
 +   ; /* file is unchanged */
 +   else {
 +

Re: custom udp/tcp ports for mountd, rpc.statd, rpc.lockd

2011-07-27 Thread Amit Kulkarni
  Do you have a particular usage that needs this?

 No, I just run a local nfs server; at the moment only serving one
 single, trusted client.
 So I'm not in desperate need for fixed ports, but I think fixed ports
 are a lot cleaner and over all easier to maintain.

 RPC does not work that way.  It uses the portmapper at port 111 for
 discovery.  NFS at 2049 is also a known port.  The rest are supposed
 to be unknown.

Windows does the same thing in RPC, but slightly better (better for
filtering rules purposes; it kills the randomness though). It restricts
RPC ports to a defined range and you can then filter based on this
range. I did this back in the Windows 2000/XP days, around 2003, with
the default firewall.

You have to find a way to restrict the RPC ports to a range in OpenBSD
(I don't think Theo or the others would go for this) or maybe write a
PF rule using portmapper for discovery (I don't know how to do this as
I am still a pf newbie, no rush).



Re: setrlimit code

2011-07-26 Thread Amit Kulkarni
   remove that rlimit code, rc.d and login classes do it much betterer these
   days. screaming bob ok claudio
  
 
  Similar code is in a few other places, if it's going from bgpd it
  should be zapped there too. Should we add login.conf sections for
  bgpd, ftp-proxy, inetd, relayd, smtpd and spamd, or just bump
  the default openfiles-cur for daemon to 512 or something and
  maybe add sections for a few daemons that might still have
  trouble with 512 (partly to help with the limits, and partly
  to show people how it's done)?

 At least spamd does set the openfiles-cur based on the maximum number of
 connections a user specifies. That code is OK. Daemons setting the limit
 to infinity, on the other hand, are not.

 Should we give an example for at least bgpd? 128 FDs is a bit
 tight for a peering router and we could do with having at least
 one daemon shown in login.conf as an example of how to configure
 this.


 Index: login.conf.in
 ===
 RCS file: /cvs/src/etc/login.conf.in,v
 retrieving revision 1.3
 diff -u -p -r1.3 login.conf.in
 --- login.conf.in   17 Dec 2010 05:33:06 -  1.3
 +++ login.conf.in   26 Jul 2011 12:31:02 -
 @@ -84,3 +84,10 @@ authpf:\
:welcome=/etc/motd.authpf:\
:shell=/usr/sbin/authpf:\
:tc=default:
 +
 +#
 +# Override resource limits for certain daemons started by rc.d(8)
 +#
 +bgpd:\
 +   :openfiles-cur=512:\
 +   :tc=daemon:


very elegant!

IMHO login.conf should be set based on a combination of what you have
enabled (a huge pf ruleset, bgpd, spamd, etc.) and your hardware. Can
this be shifted to sysmerge, so that when the machine configuration
changes sysmerge automagically bumps the limits?
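
For readers who want to see what "setting the limit from a configured value instead of infinity" can look like in code, here is a minimal C sketch. It is not spamd's actual code; the function name raise_openfiles() is made up for this example, and only getrlimit(2), setrlimit(2) and RLIMIT_NOFILE are real interfaces.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper, not taken from spamd: make sure the soft
 * open-files limit is at least `want`, clamped to the hard limit.
 * Returns 0 on success, -1 on error (errno set by the system call).
 */
int
raise_openfiles(rlim_t want)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl) == -1)
		return -1;
	if (rl.rlim_cur >= want)
		return 0;		/* already high enough */
	if (want > rl.rlim_max)
		want = rl.rlim_max;	/* never exceed the hard limit */
	rl.rlim_cur = want;
	return setrlimit(RLIMIT_NOFILE, &rl);
}
```

A daemon would call this once at startup with a value derived from its own configuration (for example the configured connection count plus some headroom), and rely on login.conf for anything beyond that.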



Re: Fix running ldd against multiple shared objects

2011-06-24 Thread Amit Kulkarni
On Fri, Apr 8, 2011 at 6:17 PM, Matthew Dempsky matt...@dempsky.org wrote:
 Diff below fixes ldd /usr/lib/*.so.* so that it outputs more than
 just the first shared object's dependencies, like the behavior from
 ldd /usr/bin/*.

 The issue is that dlopen(f, RTLD_TRACE) calls exit() after it's done.
 I looked into changing this to properly cleanup and return, but
 figured it was easier to just fork first like we already do for
 exec()-based tracing.

 ok?

Is anybody interested in getting this diff in?
Thanks
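
The "fork first" idea in the quoted description can be sketched in C as follows. This is only an illustration under stated assumptions, not the actual ldd diff: trace_and_exit() is a hypothetical stand-in for a call that exits instead of returning (as dlopen(f, RTLD_TRACE) is said to do), and run_in_child() and trace_all() are names invented for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for a call that exits instead of returning.
 * Here it simply exits 0 for absolute paths and 1 otherwise.
 */
void
trace_and_exit(const char *path)
{
	_exit(path[0] == '/' ? 0 : 1);
}

/* Run fn(path) in a child process; return its exit status, or -1. */
int
run_in_child(void (*fn)(const char *), const char *path)
{
	int status;
	pid_t pid = fork();

	if (pid == -1)
		return -1;
	if (pid == 0) {
		fn(path);
		_exit(127);	/* only reached if fn unexpectedly returns */
	}
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Process every path; the parent survives each child's exit. */
int
trace_all(int n, const char *const *paths)
{
	int i, failures = 0;

	for (i = 0; i < n; i++)
		if (run_in_child(trace_and_exit, paths[i]) != 0)
			failures++;
	return failures;
}
```

Because each exiting call runs in its own child, the loop in the parent keeps going after the first file, which is the behavior the diff is after.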



Re: vmmap replacement, please test

2011-05-16 Thread Amit Kulkarni
 there is a newer diff; I won't post the whole thing but in
 uvm_map.c around line 5954, replace this:

	KASSERT(start >= srcmap->min_offset && end <= srcmap->max_offset);

 with this:

	if ((start & PAGE_MASK) != 0 || (end & PAGE_MASK) != 0 || end < start)
		return EINVAL;
	if (start < srcmap->min_offset || end > srcmap->max_offset)
		return EINVAL;

you are off by about 2000 lines.

it is on line 3947 after applying ariane@ original vmmap diff

:-)
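
For anyone testing outside the kernel, the replacement check quoted above can be tried as a standalone C function. This is a sketch that reads the quoted check as a page-alignment test followed by a bounds test. The 4096-byte page size, the SKETCH_* macros and the function name check_extract_range() are assumptions made for this example and are not part of the vmmap diff.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u			/* assumed page size */
#define SKETCH_PAGE_MASK (SKETCH_PAGE_SIZE - 1)	/* low 12 bits */

/*
 * Return 0 if [start, end) is page aligned, ordered and inside
 * [min_off, max_off]; return EINVAL instead of asserting otherwise.
 */
int
check_extract_range(uintptr_t start, uintptr_t end,
    uintptr_t min_off, uintptr_t max_off)
{
	if ((start & SKETCH_PAGE_MASK) != 0 ||
	    (end & SKETCH_PAGE_MASK) != 0 || end < start)
		return EINVAL;
	if (start < min_off || end > max_off)
		return EINVAL;
	return 0;
}
```

The point of the change is visible here: bad input from a caller yields an error code rather than a panic from a failed assertion.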



Re: Filesystem Hierarchy Standard (FHS) and OpenBSD

2011-05-10 Thread Amit Kulkarni
 system32 - 64 bit dll + apps
 sysWOW - 32 bit dll + apps

 How's that for backwards compatibility.


That's utterly ridiculous. The guy responsible for such things should
be fired :)



Re: malloc: speedup chunk housekeeping

2011-05-05 Thread Amit Kulkarni
 The random number is derived from a global, which is incremented by a
 few bits every time a chunk is needed (with a small optimization if
 only one free slot is left).
 

I have no feedback on this diff but a question on random placing in 
another two functions.

In static void unmap()
	for (i = 0; tounmap > 0 && i < mopts.malloc_cache; i++) {
		r = &d->free_regions[(i + offset) & (mopts.malloc_cache - 1)];

In static void map()
	for (i = 0; i < mopts.malloc_cache; i++) {
		r = &d->free_regions[(i + offset) & (mopts.malloc_cache - 1)];

AFAIK
malloc_cache = 64
offset is in the interval [0, 15]
free_regions[] has MALLOC_MAXCACHE = 256 entries

The bitwise 'and' looks useless to me because you are only
really indexing free_regions from i+15, max of 64+15.

If you want to index free_regions randomly over its full range, maybe you 
should do something else?

Thanks,
amit 



Re: malloc: speedup chunk housekeeping

2011-05-05 Thread Amit Kulkarni
 malloc_cache is a power of the, so a bitwise and with malloc_cache - 1
 is equivalent to modulo malloc_cache.

 of two, that is.

 Room is reserved for MALLOC_MAXCACHE pointers, but only malloc_cache
 are ever used. So doing a modulo malloc_cache is ok.

Ahh, sorry for that. I was thrown by that 256!
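
The point about powers of two can be checked with a few lines of C. This is a small sketch, not code from malloc.c; the helper name wrap_index() is made up for this example. With a cache size of 64 the mask is 63, so an index such as 63 + 15 = 78 wraps around to 14 instead of running past the used part of the array.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: wrap (i + offset) into [0, cache_size).
 * For a power-of-two cache_size, masking with cache_size - 1
 * gives the same result as (i + offset) % cache_size.
 */
size_t
wrap_index(size_t i, size_t offset, size_t cache_size)
{
	/* The mask trick is only valid when cache_size is a power of two. */
	assert(cache_size != 0 && (cache_size & (cache_size - 1)) == 0);
	return (i + offset) & (cache_size - 1);
}
```

So the 'and' is doing real work: it keeps every index inside the first malloc_cache slots, no matter how large the reserved array is.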



Re: Fix running ldd against multiple shared objects

2011-04-30 Thread Amit Kulkarni
 hi. i have commented out the -x stuff for now, and removed it from
 usage(). the rest of your diff is for other people though, so i'll leave
 it there.

 jmc

It isn't my diff, it's from matthew@. I just diffed on top of it.
thanks



Re: Fix running ldd against multiple shared objects

2011-04-28 Thread Amit Kulkarni
 Diff below fixes ldd /usr/lib/*.so.* so that it outputs more than
 just the first shared object's dependencies, like the behavior from
 ldd /usr/bin/*.
 
 The issue is that dlopen(f, RTLD_TRACE) calls exit() after it's done.
 I looked into changing this to properly cleanup and return, but
 figured it was easier to just fork first like we already do for
 exec()-based tracing.


I was checking this diff after the libc.so and libm.so change today, to 
verify what's on the machine.

ldd -x is unused, so I removed it from manpage and ldd.c, diff attached at 
the very end of this email.

when you do ldd /usr/lib/*.so.* you get a funny output

/usr/lib/libasn1.so.18.0:
Start        End          Type Open Ref GrpRef Name
000207297000 000207735000 dlib 1    0   0      /usr/lib/libasn1.so.18.0
/usr/lib/libc.so.58.0:
Start        End          Type Open Ref GrpRef Name
00020151d000 000201a03000 dlib 1    0   0      /usr/lib/libc.so.58.2
/usr/lib/libc.so.58.1:
Start        End          Type Open Ref GrpRef Name
000203abe000 000203fa4000 dlib 1    0   0      /usr/lib/libc.so.58.2
/usr/lib/libc.so.58.2:
Start        End          Type Open Ref GrpRef Name
000208fc4000 0002094aa000 dlib 1    0   0      /usr/lib/libc.so.58.2
/usr/lib/libc.so.59.1:
/usr/lib/libcom_err.so.18.0:
Start        End          Type Open Ref GrpRef Name
000203b1     000203fae000 dlib 1    0   0      /usr/lib/libcom_err.so.18.0




/usr/lib/libkvm.so.10.0:
Start        End          Type Open Ref GrpRef Name
000206891000 000206c9a000 dlib 1    0   0      /usr/lib/libkvm.so.10.0
/usr/lib/libm.so.5.2:
Start        End          Type Open Ref GrpRef Name
00020d6b1000 00020dacf000 dlib 1    0   0      /usr/lib/libm.so.5.3
/usr/lib/libm.so.5.3:
Start        End          Type Open Ref GrpRef Name
0002042ef000 00020470d000 dlib 1    0   0      /usr/lib/libm.so.5.3
/usr/lib/libmenu.so.5.0:



Notice that ldd now prints the correct file name in each heading and the
Start and End addresses differ, but the Name column shows the wrong
file name.

ldd now prints libc.so.58.2 in every libc entry and is silent
about the current libc.so.59.1.
ldd now prints libm.so.5.3 for both libm.so.5.2 and libm.so.5.3.

I checked the FreeBSD changelog and saw they mention that ldd has a problem
handling libc.so because ldd itself is linked with libc.so:
http://svnweb.freebsd.org/base/head/usr.bin/ldd/ldd.c?view=log

Anyway, matthew's diff doesn't bail with exit 1 but bails out silently.

The modified diff which removes -x from manpage and code now follows...
I also deleted some unused headers.

thanks,
amit

Index: ldd.1
===
RCS file: /cvs/src/libexec/ld.so/ldd/ldd.1,v
retrieving revision 1.8
diff -u ldd.1
--- ldd.1   2 Mar 2009 09:27:34 -0000   1.8
+++ ldd.1   28 Apr 2011 22:20:04 -0000
@@ -32,7 +32,6 @@
 .Nd list dynamic object dependencies
 .Sh SYNOPSIS
 .Nm ldd
-.Op Fl x
 .Ar program ...
 .Sh DESCRIPTION
 .Nm
@@ -49,13 +48,6 @@
 and then execs
 .Ar program .
 .Pp
-If
-.Nm
-is invoked with the
-.Fl x
-flag, the tags from
-.Ar program
-are listed without using current ldconfig configuration.
 .Sh DIAGNOSTICS
 Exit status 0 if no error.
 Exit status 1 if arg error.
Index: ldd.c
===
RCS file: /cvs/src/libexec/ld.so/ldd/ldd.c,v
retrieving revision 1.14
diff -u ldd.c
--- ldd.c   2 Mar 2009 09:27:34 -0000   1.14
+++ ldd.c   28 Apr 2011 22:20:04 -0000
@@ -27,14 +27,10 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <elf_abi.h>
-#include <err.h>
 #include <fcntl.h>
-#include <string.h>
 #include <unistd.h>
 #include <dlfcn.h>
 
-#include <sys/stat.h>
-#include <sys/mman.h>
 #include <sys/wait.h>
 #include <sys/param.h>
 
@@ -44,23 +40,11 @@
 int
 main(int argc, char **argv)
 {
-   int c, xflag, ret;
+   int c, ret;
 
-   xflag = 0;
-	while ((c = getopt(argc, argv, "x")) != -1) {
-		switch (c) {
-		case 'x':
-			xflag = 1;
-			break;
-		default:
-			usage();
-			/*NOTREACHED*/
-		}
-	}
+	while ((c = getopt(argc, argv, "")) != -1)
+		; /* EMPTY */
 
-	if (xflag)
-		errx(1, "-x not yet implemented");
-
argc -= optind;
argv += optind;
 
@@ -84,7 +68,7 @@
 {
extern char *__progname;
 
-	fprintf(stderr, "usage: %s [-x] program ...\n", __progname);
+	fprintf(stderr, "usage: %s program ...\n", __progname);
exit(1);
 }
 
@@ -94,9 +78,9 @@
 {
Elf_Ehdr ehdr;
Elf_Phdr *phdr;
-   int fd, i, size, status, interp=0;
+   int fd, i, size, status, interp = 

Re: malloc: speed vs randomization

2011-04-26 Thread Amit Kulkarni
   This diff implements a tradeoff to gain speed at the cost of reducing
   the randomness of chunk allocation in malloc slightly.
  
   The idea is only to randomize the first half of chunks in a page. The
   second half of chunks will fill in the gaps in-order. The
   effectiveness of the current randomization decreases already with the
   number of free slots diminishing in a page.
  
   In one test case, this diff reduced the runtime from 31s to 25s. I'm
   not completely sure if the reduced randomness is acceptable. But if
  
  Perhaps a quarter?  We want to prevent adjacent consecutive
  allocations, which is still very likely at the half way point, but
  diminishes after that.
 
 Yes, that might be better, though some of the performance gain is
 lost because you are scanning a lot of bits: i free bits + all bits in
 between that are not free. If a chunk page is pretty full, that's a
 lot of bits before you find the i'th free chunk. 
 
 Originally I thought most of the time was lost getting the random bits,
 but now it seems the scanning of the bits is to blame. Unless I'm
 misinterpreting my data


Hi Otto,

Now that OpenBSD defaults to bigmem, it will suffer from the small
page size on certain platforms like amd64 and sparc64.

What do you guys think of adjusting the page size dynamically to the
datasize of FFS1, i.e. when I fire up disklabel it is 16 KB by default
on FFS1 on amd64? And higher on FFS2-only systems? Or you could base it
on the RAM size detected on the system. I don't know what's best, you guys know.

If you then combine this with your original/tedu@ suggestion of consecutive
allocations nearing the end of a page, it would be best. Then you have a
bigger page size, which packs in more allocations and also gives more
room to randomize. Of course, there is the problem of guard pages, where
memory is wasted if the guard page size == the normal page size.
The guard page size could stay at the current page size, which is 4096 on amd64.

This would take care of almost all scenarios, including supporting old
drives, current disks with massive caches, and the upcoming world of SSDs.
And RAM sizes are exploding too; an 8GB stick is becoming mainstream.

IMHO, increasing the page size will lead to bigger sustainable gains.

Thanks,
amit



Re: Compiling the kernel with pcc

2011-04-26 Thread Amit Kulkarni
Hi,

clang still can't compile the kernel on amd64 and presumably all
other architectures. And I had sent an email to that effect to the clang list.

I had env CC=clang make clean && make depend && make in my
build_kernel.sh file

it only works when you have env CC=clang make

The recent removal of make depend exposed me. I removed
clean && depend, as I do an rm -rf of the compile/GENERIC.MP folder

Sorry about that,
amit

On Mon, 4 Apr 2011, Philip Guenther wrote:

 On Mon, Apr 4, 2011 at 11:06 AM, Pascal Stumpf pascal.stu...@cubes.de wrote:
  pcc currently only chokes on some inline functions that need external
  linkage. gcc isn't pesky about that, but pcc and clang are (rightfully,
  imo).
 
 It's completely legal and defined (by the standard and not just gcc!)
 for a function to be inline in the file where it's defined and have
 external linkage.  That just means "inline if you can in this file,
 but still provide a copy callable from other files".  That's exactly
 the semantic we want for pf_addr_compare().  If pcc or clang are
 complaining about it they're broken or their warning settings are
 misset.
 
 
 Philip Guenther
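
The arrangement described above can be sketched in a few lines of C. This is an illustration only, assuming C99 inline rules; addr_compare() and is_sorted_pair() are hypothetical names for this example, not the real pf_addr_compare() code. The plain declaration plays the role of the header (pfvar.h in the thread), and the inline definition plays the role of the defining file (pf.c).

```c
/*
 * What a shared header would carry: a plain, non-inline declaration,
 * so that other files can call the function.
 */
int addr_compare(unsigned int a, unsigned int b);

/*
 * In the defining file: inline here, but because the declaration
 * above lacks "inline", C99 treats this as an external definition
 * as well, so a callable copy is still emitted.
 */
inline int
addr_compare(unsigned int a, unsigned int b)
{
	if (a > b)
		return 1;
	if (a < b)
		return -1;
	return 0;
}

/* A caller in the same file, where the compiler is free to inline. */
int
is_sorted_pair(unsigned int a, unsigned int b)
{
	return addr_compare(a, b) <= 0;
}
```

If every file-scope declaration in the file said "inline" without "extern", C99 would treat the definition as an inline definition only, and no external copy would be provided from that file.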



spurious sys/types.h include in man pages

2011-04-26 Thread Amit Kulkarni
Maybe I am missing something but the following manpages don't really need 
<sys/types.h>.

I compiled some small programs without <sys/types.h>.

thanks,
amit


Index: mincore.2
===
RCS file: /cvs/src/lib/libc/sys/mincore.2,v
retrieving revision 1.10
diff -u mincore.2
--- mincore.2   31 May 2007 19:19:33 -0000  1.10
+++ mincore.2   27 Apr 2011 03:22:38 -0000
@@ -37,7 +37,6 @@
 .Nm mincore
 .Nd determine residency of memory pages
 .Sh SYNOPSIS
-.Fd #include <sys/types.h>
 .Fd #include <sys/mman.h>
 .Ft int
 .Fn mincore "void *addr" "size_t len" "char *vec"
Index: minherit.2
===
RCS file: /cvs/src/lib/libc/sys/minherit.2,v
retrieving revision 1.13
diff -u minherit.2
--- minherit.2  31 May 2007 19:19:33 -0000  1.13
+++ minherit.2  27 Apr 2011 03:22:38 -0000
@@ -36,7 +36,6 @@
 .Nm minherit
 .Nd control the inheritance of pages
 .Sh SYNOPSIS
-.Fd #include <sys/types.h>
 .Fd #include <sys/mman.h>
 .Ft int
 .Fn minherit "void *addr" "size_t len" "int inherit"
Index: mlock.2
===
RCS file: /cvs/src/lib/libc/sys/mlock.2,v
retrieving revision 1.16
diff -u mlock.2
--- mlock.2 31 May 2007 19:19:33 -0000  1.16
+++ mlock.2 27 Apr 2011 03:22:38 -0000
@@ -38,7 +38,6 @@
 .Nm munlock
 .Nd lock (unlock) physical pages in memory
 .Sh SYNOPSIS
-.Fd #include <sys/types.h>
 .Fd #include <sys/mman.h>
 .Ft int
 .Fn mlock "void *addr" "size_t len"
Index: mlockall.2
===
RCS file: /cvs/src/lib/libc/sys/mlockall.2,v
retrieving revision 1.5
diff -u mlockall.2
--- mlockall.2  26 Jun 2008 05:42:05 -0000  1.5
+++ mlockall.2  27 Apr 2011 03:22:38 -0000
@@ -37,7 +37,6 @@
 .Nm munlockall
 .Nd lock (unlock) the address space of a process
 .Sh SYNOPSIS
-.Fd #include <sys/types.h>
 .Fd #include <sys/mman.h>
 .Ft int
 .Fn mlockall "int flags"
Index: mmap.2
===
RCS file: /cvs/src/lib/libc/sys/mmap.2,v
retrieving revision 1.38
diff -u mmap.2
--- mmap.2  11 Apr 2011 17:46:19 -0000  1.38
+++ mmap.2  27 Apr 2011 03:22:38 -0000
@@ -37,7 +37,6 @@
 .Nm mmap
 .Nd map files or devices into memory
 .Sh SYNOPSIS
-.Fd #include <sys/types.h>
 .Fd #include <sys/mman.h>
 .Ft void *
 .Fn mmap "void *addr" "size_t len" "int prot" "int flags" "int fd" "off_t offset"
Index: mprotect.2
===
RCS file: /cvs/src/lib/libc/sys/mprotect.2,v
retrieving revision 1.15
diff -u mprotect.2
--- mprotect.2  12 Feb 2010 21:49:10 -0000  1.15
+++ mprotect.2  27 Apr 2011 03:22:38 -0000
@@ -37,7 +37,6 @@
 .Nm mprotect
 .Nd control the protection of pages
 .Sh SYNOPSIS
-.Fd #include <sys/types.h>
 .Fd #include <sys/mman.h>
 .Ft int
 .Fn mprotect "void *addr" "size_t len" "int prot"
Index: msync.2
===
RCS file: /cvs/src/lib/libc/sys/msync.2,v
retrieving revision 1.21
diff -u msync.2
--- msync.2 31 May 2007 19:19:33 -0000  1.21
+++ msync.2 27 Apr 2011 03:22:38 -0000
@@ -37,7 +37,6 @@
 .Nm msync
 .Nd synchronize a mapped region
 .Sh SYNOPSIS
-.Fd #include <sys/types.h>
 .Fd #include <sys/mman.h>
 .Ft int
 .Fn msync "void *addr" "size_t len" "int flags"
Index: munmap.2
===
RCS file: /cvs/src/lib/libc/sys/munmap.2,v
retrieving revision 1.14
diff -u munmap.2
--- munmap.2    31 Jan 2009 16:52:15 -0000  1.14
+++ munmap.2    27 Apr 2011 03:22:38 -0000
@@ -37,7 +37,6 @@
 .Nm munmap
 .Nd remove a mapping
 .Sh SYNOPSIS
-.Fd #include <sys/types.h>
 .Fd #include <sys/mman.h>
 .Ft int
 .Fn munmap "void *addr" "size_t len"
Index: ptrace.2
===
RCS file: /cvs/src/lib/libc/sys/ptrace.2,v
retrieving revision 1.26
diff -u ptrace.2
--- ptrace.2    16 Sep 2008 19:41:06 -0000  1.26
+++ ptrace.2    27 Apr 2011 03:22:38 -0000
@@ -9,7 +9,6 @@
 .Nm ptrace
 .Nd process tracing and debugging
 .Sh SYNOPSIS
-.Fd #include <sys/types.h>
 .Fd #include <sys/ptrace.h>
 .Ft int
 .Fn ptrace "int request" "pid_t pid" "caddr_t addr" "int data"
Index: shmat.2
===
RCS file: /cvs/src/lib/libc/sys/shmat.2,v
retrieving revision 1.14
diff -u shmat.2
--- shmat.2 31 May 2007 19:19:33 -0000  1.14
+++ shmat.2 27 Apr 2011 03:22:38 -0000
@@ -38,7 +38,6 @@
 .Nm shmdt
 .Nd map/unmap shared memory
 .Sh SYNOPSIS
-.Fd #include <sys/types.h>
 .Fd #include <sys/ipc.h>
 .Fd #include <sys/shm.h>
 .Ft void *
Index: shmctl.2
===
RCS file: /cvs/src/lib/libc/sys/shmctl.2,v
retrieving revision 1.13
diff -u shmctl.2
--- shmctl.2    31 May 2007 19:19:33 -0000  1.13
+++ shmctl.2    27 Apr 2011 03:22:38 -0000
@@ -37,7 +37,6 @@
 .Nm shmctl
 .Nd shared memory control operations
 .Sh 

Re: ld.so speedup for large binaries with many shared libraries

2011-04-24 Thread Amit Kulkarni
from http://www.feyrer.de/NetBSD/bx/blosxom.cgi/index.front?-tags=arm

http://mail-index.netbsd.org/tech-userlevel/2010/02/24/msg003325.html

it will help Chrome, Firefox, Webkit, GNOME, KDE, LibreOffice, vlc
(and similar monsters like those)

On Sun, Apr 24, 2011 at 4:53 AM, Antti Harri i...@openbsd.fi wrote:
 Hi,

 seems to work on amd64. Not sure if it made things quicker, but it sure didn't
 make them slower either.

 --
 Antti Harri



Re: namespace.h

2011-04-21 Thread Amit Kulkarni
I just commented out that shm_open/unlink portion; yes, yes, I know it's
bad. But spawn.h + posix_spawn.c is absolutely needed for lldb, and it
will need heavy commentary from the header guys, just like the recent
fenv.h changes. Otherwise, the alternative is an ugly patch set for
lldb, which is more of a pain to maintain in the years ahead.

It's much easier to see if the FreeBSD code can work as close to unmodified
as possible before I revisit this issue on tech@ to see how to bring initial
lldb support to OpenBSD. I will have gained some more experience by then.

On Thu, Apr 21, 2011 at 1:15 PM, Matthew Dempsky matt...@dempsky.org wrote:
 On Thu, Apr 21, 2011 at 11:14 AM, Matthew Dempsky matt...@dempsky.org
wrote:
 We do have shmat(2) and shmdt(2).  Can't you just use
 open(2)/unlink(2) instead of shm_open()/shm_unlink()?

 Bah, ignore that.



Re: namespace.h

2011-04-20 Thread Amit Kulkarni
 where is a listing of all functions implemented in openbsd's libc?

 nm /usr/lib/libc.a ?  What's the question that you're *really*
 trying to answer?

Just trying to see how far llvm+lldb goes in compilation. Right now
it's stuck because of the missing spawn.h in OpenBSD. It is also stuck on
shm_open and shm_unlink, all of which POSIX states are optional.

Is it then okay for me to update namespace.h to what I can glean from
/usr/src?

 Does src/lib/libc/include/namespace.h consist of functions that are not
 implemented, or is it a relic?

 Neither.


 i was looking for the equivalent of FreeBSD's file of the same name and
 location but in OpenBSD.

 Our file serves the same purpose as FreeBSD's, it's just incomplete.
 What do you think FreeBSD is for/does?

I think FreeBSD's file is present so it can handle POSIX compatibility
by undefining all of the namespace functionality in un-namespace.h. Beyond
that I haven't looked.

Thanks



namespace.h

2011-04-19 Thread Amit Kulkarni
hi,

where is a listing of all functions implemented in OpenBSD's libc? Does
src/lib/libc/include/namespace.h consist of functions that are not
implemented, or is it a relic?

i was looking for the equivalent of FreeBSD's file of the same name and 
location but in OpenBSD.

thanks,
amit



Making clang++ actually work ( WAS Re: Changing llvm/clang++ to use libstdc++ from ports instead of base)

2011-04-08 Thread Amit Kulkarni
Hi,

referring to http://marc.info/?l=openbsd-ports&m=129789998809077&w=2

I am in the process of updating llvm+clang from 2.8 ==> 2.9
(maintainer is ports@). I have done the update on amd64 (it will need some
feedback) and am sending it as a separate email. The update can go in or
not, but the clang++ C++ compiler absolutely needs a fix to be usable; the
clang C compiler works.

Right now llvm+clang 2.8 is unusable as a C++ compiler on OpenBSD
because of the gcc issue highlighted in the thread below on ports@. This
C++ compilation issue was initially brought up and then solved by
matthew@ when prodded by espie@. I am using the version of
os_defines.h from that thread, have rebuilt userland since that time, and
clang++ works fine. matthew@ committed the first part of that fix, for
libc __cxa_finalize. This part needs consensus approval from you guys
since it might affect stuff negatively.

I am kickstarting this discussion in the hope this gets resolved
sometime before 5.0 lock and I don't have to see an M next to this file
when I do a cvs up.

Investigating these CHECK / DYNAMIC defines in the other BSDs gives
the following results. FreeBSD has almost the same file as OpenBSD. So if
anybody has access to a FreeBSD stable 8.2 or current system, can they
compile any C++ program with clang++ 2.9? This is because their header
file is the same as OpenBSD's.


freebsd-src-current/head/contrib/libstdc++/config/os/bsd/freebsd/os_defines.h

GCC 4.2.1? present (Can clang++ actually compile C++ programs? verify)

#define _GLIBCXX_USE_C99_CHECK 1
#define _GLIBCXX_USE_C99_DYNAMIC (!(__ISO_C_VISIBLE >= 1999))
#define _GLIBCXX_USE_C99_LONG_LONG_CHECK 1
#define _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC (_GLIBCXX_USE_C99_DYNAMIC || !defined __LONG_LONG_SUPPORTED)
#define _GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_CHECK 1
#define _GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_DYNAMIC defined _XOPEN_SOURCE


dragonfly-src-current/src/contrib/gcc-4.4/libstdc++-v3/config/os/bsd/dragonfly/os_defines.h

GCC 4.4.5? note how its completely absent on 4.4.5

/* JRM: Disable all these defines until better understood.
They work fine on vendor gcc 4.6, but break buildworld.
#define _GLIBCXX_USE_C99 1
#define _GLIBCXX_USE_C99_CHECK 1
#define _GLIBCXX_USE_C99_DYNAMIC (!(__ISO_C_VISIBLE >= 1999))
#define _GLIBCXX_USE_C99_LONG_LONG_CHECK 1
#define _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC (_GLIBCXX_USE_C99_DYNAMIC || !defined __LONG_LONG_SUPPORTED)
#define _GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_CHECK 1
#define _GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_DYNAMIC defined _XOPEN_SOURCE
*/

netbsd-src-current/src/gnu/dist/gcc4/libstdc++-v3/config/os/bsd/netbsd/os_defines.h

gcc 4.1.* or 4.2.1? absent with additional define

#ifndef _GLIBCXX_OS_DEFINES
#define _GLIBCXX_OS_DEFINES 1

// System-specific #define, typedefs, corrections, etc, go here.  This
// file will come before all others.

#define __ssize_t ssize_t



/usr/src/gnu/gcc/libstdc++-v3/config/os/bsd/openbsd/os_defines.h

GCC 4.2.1, present with additional typedef

#define _GLIBCXX_USE_C99 1
#define _GLIBCXX_USE_C99_CHECK 1
#define _GLIBCXX_USE_C99_DYNAMIC (!(__ISO_C_VISIBLE >= 1999))
#define _GLIBCXX_USE_C99_LONG_LONG_CHECK 1
#define _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC (_GLIBCXX_USE_C99_DYNAMIC || !defined __LONG_LONG_SUPPORTED)
#define _GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_CHECK 1
#define _GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_DYNAMIC defined _XOPEN_SOURCE

typedef __builtin_va_list __gnuc_va_list;

Note, matthew@ said in another email in that thread that Ubuntu
doesn't define it.

Will some guru please find time to investigate this issue?

Thanks,
amit

P.S I am trying to see how far clang goes in compiling userland 
xenocara, the latter of which needs clang++ as CXX compiler.


On Tue, Feb 15, 2011 at 10:13 PM, Matthew Dempsky matt...@dempsky.org
wrote:
 On Tue, Feb 15, 2011 at 11:57:44PM +0100, Marc Espie wrote:
 Finding out why it doesn't work with g++ from src would be
 the correct approach.

 Investigating further, it seems to boil down to _GLIBCXX_USE_C99_CHECK
 in /usr/include/g++/*/bits/os_defines.h.  Including cstdio from base
 effectively feeds the following to the compiler:

typedef unsigned long size_t;

extern "C" {
int snprintf(char *, size_t, const char *, ...)
  __attribute__((__format__ (printf, 3, 4)))
  __attribute__((__nonnull__ (3)))
  __attribute__((__bounded__ (__string__,1,2)));
}

namespace std {
  extern "C" int
  (snprintf)(char * restrict, size_t, const char * restrict, ...);
  using ::snprintf;
}

 g++ doesn't mind the conflicting definitions for snprintf (it doesn't
 seem to care about *any* conflicts actually; you could put "int
 snprintf(double x)" and it's still happy), but clang++ does.

 We don't define anything in os_defines.h in libstdc++ from ports, so
 the the second 

Re: fsck_ffs diff needs testing

2011-04-07 Thread Amit Kulkarni
 Also, depending on the usage patterns, you might have a fs where high
 numbered inodes are used, while the fs itself is pretty empty. Filling
 up a fs with lots of files and then removing a lot of them is an
 example that could lead to such a situation. This diff does not speed
 things up in such cases.

 ...might have an impact in my case, since I often do things like rebuilding
 the system including tons of packages on this machine, and that use case of
 course perfectly matches what you say above. I think I'll remake these file
 systems and run the test again just to satisfy my curiosity. But that'll
 have to wait until after dinner. :-)

 Anyway, I see improvements both in memory usage and in speed, and I see no
 obvious malfunctions, so I'd say it's a go.


Hi Otto,

Comparing your diff with FreeBSD svn (not cvs, they dropped cvs! my
bad on the initial comment) after Benny pointed this out.

http://svn.freebsd.org/viewvc/base/head/sbin/fsck_ffs/pass1.c?revision=201708&view=markup

Look at this comment inside the file

/*
 * This optimization speeds up future runs of fsck
 * by trimming down the number of inodes in cylinder
 * groups that formerly had many inodes but now have
 * fewer in use.
 */

and the commit entry by McKusick for rev 188110

Update the actions previously attempted by the -D option to make them
robust. With these changes fsck is now able to detect and reliably
rebuild corrupted cylinder group maps. The -D option is no longer
necessary as it has been replaced by a prompt asking whether the
corrupted cylinder group should be rebuilt and doing so when requested.
These actions are only offered and taken when running fsck in manual
mode. Corrupted cylinder groups found during preen mode cause the fsck
to fail.

Add the -r option to free up excess unused inodes. Decreasing the
number of preallocated inodes reduces the running time of future
runs of fsck and frees up space that can allocated to files. The -r
option is ignored when running in preen mode.
--

Will you please please please integrate that part of the code too!!!
This is absolutely useful to have and is a fairly common situation. It
will make fsck better and be a good way to partially defrag your fs.

Or did you plan to keep that for later after more testing on this diff?

Thanks,
amit



Re: fsck_ffs diff needs testing

2011-04-07 Thread Amit Kulkarni
Thanks! I understand.

On Thu, Apr 7, 2011 at 11:09 AM, Otto Moerbeek o...@drijf.net wrote:
 On Thu, Apr 07, 2011 at 11:05:42AM -0500, Amit Kulkarni wrote:

  Also, depending on the usage patterns, you might have a fs where high
  numbered inodes are used, while the fs itself is pretty empty. Filling
  up a fs with lots of files and then removing a lot of them is an
  example that could lead to such a situation. This diff does not speed
  things up in such cases.
 
  ...might have an impact in my case, since I often do things like rebuilding
  the system including tons of packages on this machine, and that use case of
  course perfectly matches what you say above. I think I'll remake these file
  systems and run the test again just to satisfy my curiosity. But that'll
  have to wait until after dinner. :-)
 
  Anyway, I see improvements both in memory usage and in speed, and I see no
  obvious malfunctions, so I'd say it's a go.


 Hi Otto,

 Comparing your diff with FreeBSD svn (not cvs, they dropped cvs! my
 bad on the initial comment) after Benny pointed this out.


http://svn.freebsd.org/viewvc/base/head/sbin/fsck_ffs/pass1.c?revision=201708&view=markup

 Look at this comment inside the file

 /*
* This optimization speeds up future runs of fsck
* by trimming down the number of inodes in cylinder
* groups that formerly had many inodes but now have
* fewer in use.
*/

 and the commit entry by McKusick for rev 188110

 Update the actions previously attempted by the -D option to make them
 robust. With these changes fsck is now able to detect and reliably
 rebuild corrupted cylinder group maps. The -D option is no longer
 necessary as it has been replaced by a prompt asking whether the
 corrupted cylinder group should be rebuilt and doing so when requested.
 These actions are only offered and taken when running fsck in manual
 mode. Corrupted cylinder groups found during preen mode cause the fsck
 to fail.

 Add the -r option to free up excess unused inodes. Decreasing the
 number of preallocated inodes reduces the running time of future
 runs of fsck and frees up space that can allocated to files. The -r
 option is ignored when running in preen mode.
 --

 Will you please please please integrate that part of the code too!!!
 This is absolutely useful to have and is a fairly common situation. It
 will make fsck better and be a good way to partially defrag your fs.

 Or did you plan to keep that for later after more testing on this diff?

 Thanks,
 amit

 Yes, I go step by step.

-Otto



Re: Compiling the kernel with pcc

2011-04-04 Thread Amit Kulkarni
On Mon, 4 Apr 2011, Alexander Bluhm wrote:

 On Mon, Apr 04, 2011 at 08:06:57PM +0200, Pascal Stumpf wrote:
  net/pf.c:   pf_addr_compare (was probably ok before r1.729)
 
 The current implementation has been discussed.  See also:
 http://www.greenend.org.uk/rjk/2003/03/inline.html
 
 The function should be inline within pf.c and callable from pf_norm.c.
 Defining it inline in pf.c allows faster code there, declaring it
 non-inline in pfvar.h creates callable code in pf.o.  gcc always
 generates callable code, so it does not matter here.
 
 Do pcc and clang compile and link if you put an inline in the pfvar.h
 declaration?

clang trunk v128001 has no problem if I put __inline in the pfvar.h declaration.

FYI, I am running on a kernel built using that clang revision.

 
  Do those *have* to be inline?
 
 Yes, pf_addr_compare() is used in pf state lookup.  So it must be
 fast.  The generated RB functions use it inline.
 
 bluhm



Re: horribly slow fsck_ffs pass1 performance

2011-04-02 Thread Amit Kulkarni
Hi,

I am replying in a single email.

I do a fsck once in a while, not regularly. In the last 6-8 months I
might have done it about 5 times. I did it in multi-user mode the few
times I did it, but in future I plan to do it in single-user mode, and I
plan to do it monthly. After seeing the messages you get when you fsck, it
is better to do it monthly. FreeBSD, which is the origin of FFS, does a
background fsck, and if Kirk McKusick feels so strongly about it I will do
it too. (I remember somebody talking about having background fsck here on
an OpenBSD list, but I forgot who it was.)

FS code in OpenBSD is mature and appears to be better than on FreeBSD.
Linux has a problem with fsync() on ext3 (maybe even ext4), which is
why they do it so often. I read that they go for more speed and pay
less attention to data integrity. I have only been using OpenBSD for about
6-8 months, so I will try it out. I don't have anything important on that
OpenBSD machine; everything is backed up safely. Once I am fully
satisfied I won't do it monthly, maybe less often or most likely never. I
will be experimenting with fsck, given that new code change by Otto, for at
least the next few months. You guys know the limits and
capabilities. So *you* don't need to; some others might or might not. But I
am learning, and I want to be on a stable, virus-free, trojan-free,
crapware-free machine. The choice for me is one of the BSDs. What is
a new guy to know?

Thanks,
amit

On Sat, Apr 2, 2011 at 10:46 AM, Benny Lofgren bl-li...@lofgren.biz wrote:
 On 2011-04-01 19.03, Amit Kulkarni wrote:
 Thank you Arthur and the team for a very fast turnaround! Thank you
 for reducing the pain. I will schedule a fsck every month or so,
 knowing it won't screw up anything and be done really quick.

 Why schedule fsck runs at all? The file system code is very mature and
 although of course it would be unwise to declare it bug free, I see very
 little reason to run fsck on a file system unless there have been some
 problem like an unclean shutdown to prompt it (in which case of course,
 the system does it for you automatically when rebooting).

 I've noticed that some (all?) linux systems do uncalled-for file system
 checks at boot if no checks have been made recently, but I've never
 understood this practice. It must mean they don't trust their own file
 systems, which frankly I find a bit unsettling... I'd rather use a file
 system that's been field proven for decades than use something that's
 just come out of the experimenting shop.


 Regards,
 /Benny

 --
 internetlabbet.se / work:   +46 8 551 124 80  / Words must
 Benny Löfgren/  mobile: +46 70 718 11 90 /   be weighed,
/   fax:+46 8 551 124 89/not counted.
   /email:  benny -at- internetlabbet.se


