Re: pool related crashes, but "kernel did no panic"

2016-06-09 Thread Alexey Suslikov
On Tue, May 31, 2016 at 7:16 PM, Theo de Raadt  wrote:
>> is exactly 80 characters long (doesn't such a long printf violate the
>> "80 chars" rule?).
>
> there is no hard and fast rule for that at all; printing extra newlines
> has other downsides such as the screen scrolling sooner.

Hi. I finally have a trace with a pfsync-related panic. See here:

http://article.gmane.org/gmane.os.openbsd.bugs/23666



Re: pool related crashes, but "kernel did no panic"

2016-05-31 Thread Alexey Suslikov
On Mon, May 30, 2016 at 9:02 PM, Ted Unangst <t...@tedunangst.com> wrote:
> Alexey Suslikov wrote:
>> On Thu, May 12, 2016 at 4:14 PM, Bob Beck <b...@openbsd.org> wrote:
>> > Thank you! Now that's a bug report.
>>
>> Hi.
>>
>> Moved to 6.0-beta some time ago to make crash dumps more up
>> to date. Also, removed some services to minimize their impact.
>>
>> A fresh build against today's cvs didn't survive even half a day.
>>
>> http://article.gmane.org/gmane.os.openbsd.bugs/23593
>>
>> For me, it looks like: 5.7-5.8 - rare crashes, 5.9-6.0 - more frequent
>> crashes.
>>
>> Backtrace differs from crash to crash, but this remains the same:
>>
>> Stopped at  pool_put+0x1dd: xorq    0x8(%rax),%rcx
>>
>> Do you have any idea where I should look in the source code?
>
> sys/kern/subr_pool.c

Thanks for your replies, especially to Stefan, who noticed the "show pools"
output being truncated for some reason.

Here, kernel output is redirected to com, which is redirected to the KVM;
a browser with a Java applet is connected to the KVM. This is how I capture it.

amappl1: pool(0x81974640:amappl1): page inconsistency: page 0xff01e0

is exactly 80 characters long (doesn't such a long printf violate the
"80 chars" rule?).

Maybe there's a bug in kvm (java applet?) and output gets truncated.

Anyway, let's see, because now I run with the following:

Index: sys/kern/subr_pool.c
===
RCS file: /cvs/src/sys/kern/subr_pool.c,v
retrieving revision 1.194
diff -u -p -u -p -r1.194 subr_pool.c
--- sys/kern/subr_pool.c    15 Jan 2016 11:21:58 -0000    1.194
+++ sys/kern/subr_pool.c    31 May 2016 09:10:21 -0000
@@ -1160,7 +1160,8 @@ pool_chk_page(struct pool *pp, struct po
 page = (caddr_t)((u_long)ph & pp->pr_pgmask);
 if (page != ph->ph_page && POOL_INPGHDR(pp)) {
 printf("%s: ", label);
-printf("pool(%p:%s): page inconsistency: page %p; "
+printf("pool(%p:%s):\n"
+"page inconsistency: page %p;\n"
 "at page head addr %p (p %p)\n",
 pp, pp->pr_wchan, ph->ph_page, ph, page);
 return 1;
@@ -1172,9 +1173,10 @@ pool_chk_page(struct pool *pp, struct po
 if ((caddr_t)pi < ph->ph_page ||
 (caddr_t)pi >= ph->ph_page + pp->pr_pgsize) {
 printf("%s: ", label);
-printf("pool(%p:%s): page inconsistency: page %p;"
-" item ordinal %d; addr %p\n", pp,
-pp->pr_wchan, ph->ph_page, n, pi);
+printf("pool(%p:%s):\n"
+"page inconsistency: page %p;\n"
+"item ordinal %d; addr %p\n",
+pp, pp->pr_wchan, ph->ph_page, n, pi);
 return (1);
 }

@@ -1204,16 +1206,18 @@ pool_chk_page(struct pool *pp, struct po
 #endif /* DIAGNOSTIC */
 }
 if (n + ph->ph_nmissing != pp->pr_itemsperpage) {
-printf("pool(%p:%s): page inconsistency: page %p;"
-" %d on list, %d missing, %d items per page\n", pp,
-pp->pr_wchan, ph->ph_page, n, ph->ph_nmissing,
+printf("pool(%p:%s):\n"
+"page inconsistency: page %p;\n"
+"%d on list, %d missing, %d items per page\n",
+pp, pp->pr_wchan, ph->ph_page, n, ph->ph_nmissing,
 pp->pr_itemsperpage);
 return 1;
 }
 if (expected >= 0 && n != expected) {
-printf("pool(%p:%s): page inconsistency: page %p;"
-" %d on list, %d missing, %d expected\n", pp,
-pp->pr_wchan, ph->ph_page, n, ph->ph_nmissing,
+printf("pool(%p:%s):\n"
+"page inconsistency: page %p;\n"
+"%d on list, %d missing, %d expected\n",
+pp, pp->pr_wchan, ph->ph_page, n, ph->ph_nmissing,
 expected);
 return 1;
 }
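
A small userland sketch of the idea behind the diff above: the same diagnostic, once as a single long line and once broken into shorter lines so that a console limited to 80 columns cannot cut it off. The function names, the `ex_` prefix and the pointer values used to exercise it are all made up for illustration; the message text follows the printf strings in the diff, and `%#llx` stands in for the kernel's `%p`.

```c
#include <stdio.h>
#include <string.h>

/* Length of the longest '\n'-separated line in s. */
size_t
ex_longest_line(const char *s)
{
	size_t cur = 0, max = 0;

	for (; *s != '\0'; s++) {
		if (*s == '\n') {
			if (cur > max)
				max = cur;
			cur = 0;
		} else
			cur++;
	}
	return (cur > max ? cur : max);
}

/* One long line, in the style of the original message. */
int
ex_format_long(char *buf, size_t len, const char *label,
    unsigned long long pp, const char *wchan, unsigned long long page,
    unsigned long long ph, unsigned long long p)
{
	return (snprintf(buf, len,
	    "%s: pool(%#llx:%s): page inconsistency: page %#llx; "
	    "at page head addr %#llx (p %#llx)\n",
	    label, pp, wchan, page, ph, p));
}

/* Same information, with each piece starting on its own line. */
int
ex_format_split(char *buf, size_t len, const char *label,
    unsigned long long pp, const char *wchan, unsigned long long page,
    unsigned long long ph, unsigned long long p)
{
	return (snprintf(buf, len,
	    "%s: pool(%#llx:%s):\n"
	    "page inconsistency: page %#llx;\n"
	    "at page head addr %#llx (p %#llx)\n",
	    label, pp, wchan, page, ph, p));
}
```

With full 64-bit addresses the single-line form runs well past 80 columns, while every line of the split form stays below it. The price, as noted earlier in the thread, is that the screen scrolls sooner.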



Re: pool related crashes, but "kernel did no panic"

2016-05-30 Thread Alexey Suslikov
On Thu, May 12, 2016 at 4:14 PM, Bob Beck  wrote:
> Thank you! Now that's a bug report.

Hi.

Moved to 6.0-beta some time ago to make crash dumps more up
to date. Also, removed some services to minimize their impact.

A fresh build against today's cvs didn't survive even half a day.

http://article.gmane.org/gmane.os.openbsd.bugs/23593

For me, it looks like: 5.7-5.8 - rare crashes, 5.9-6.0 - more frequent
crashes.

Backtrace differs from crash to crash, but this remains the same:

Stopped at  pool_put+0x1dd: xorq    0x8(%rax),%rcx

Do you have any idea where I should look in the source code?

Thanks.



Re: pool related crashes, but "kernel did no panic"

2016-05-13 Thread Alexey Suslikov
On Fri, May 13, 2016 at 3:59 AM, David Gwynne <da...@gwynne.id.au> wrote:
>
>> On 12 May 2016, at 20:28, Alexey Suslikov <alexey.susli...@gmail.com> wrote:
>>
>> On Wed, Apr 27, 2016 at 7:22 PM, Theo de Raadt <dera...@cvs.openbsd.org> 
>> wrote:
>>>> On 27/04/16(Wed) 15:45, Alexey Suslikov wrote:
>>>>> Theo de Raadt  cvs.openbsd.org> writes:
>>>>>
>>>>>>
>>>>>> Most of these bug reports completely stink.
>>>>>>
>>>>>> ALWAYS include *ALL* information in a report.
>>>>>
>>>>> In an idealistic world, yes.
>>>>
>>>> In an idealistic world there would be no bug.
>>>
>>> In an idealistic world, Alexey Suslikov wouldn't feel compelled to
>>> defend sloppiness.
>>
>> follow up is here
>>
>> http://marc.info/?l=openbsd-bugs&m=146304833425471&w=2
>> http://marc.info/?l=openbsd-bugs&m=146304864925575&w=2
>>
>
> this should be fixed in stable. can you make sure you have the following:
>
> http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/kern/uipc_mbuf.c.diff?r1=1.219&r2=1.219.2.1

What do you think about this (new) one?

http://marc.info/?l=openbsd-bugs&m=146312050712969&w=2

I really can do more to debug this, and I have been asking for advice
since the beginning of this thread.



Re: pool related crashes, but "kernel did no panic"

2016-05-12 Thread Alexey Suslikov
On Wed, Apr 27, 2016 at 7:22 PM, Theo de Raadt <dera...@cvs.openbsd.org> wrote:
>> On 27/04/16(Wed) 15:45, Alexey Suslikov wrote:
>> > Theo de Raadt  cvs.openbsd.org> writes:
>> >
>> > >
>> > > Most of these bug reports completely stink.
>> > >
>> > > ALWAYS include *ALL* information in a report.
>> >
>> > In an idealistic world, yes.
>>
>> In an idealistic world there would be no bug.
>
> In an idealistic world, Alexey Suslikov wouldn't feel compelled to
> defend sloppiness.

follow up is here

http://marc.info/?l=openbsd-bugs&m=146304833425471&w=2
http://marc.info/?l=openbsd-bugs&m=146304864925575&w=2



Re: pool related crashes, but "kernel did no panic"

2016-04-27 Thread Alexey Suslikov
Theo de Raadt  cvs.openbsd.org> writes:

> 
> Most of these bug reports completely stink.
> 
> ALWAYS include *ALL* information in a report.

In an idealistic world, yes.

The above are not parts of a "chain", but different manifestations of the
same bug. To have both the blue screen and ddb, I need to keep the KVM console
running in a browser for an undefined period of time (a crash can occur twice
a day, or once in two months), which isn't as easy as it seems.

But sure, I'll try to file a more complete report.



Re: pool related crashes, but "kernel did no panic"

2016-04-27 Thread Alexey Suslikov
Stuart Henderson  spacehopper.org> writes:

> There should be some lines printed before you get dumped into DDB
> (probably a uvm_fault), the information in them is important.

I either have a screenshot, or ddb. Not both at the same time.

Here is one of the screenshots from 5.9, transcribed:

uvm_fault(0x81940240, 0x10, 0, 1) -> e
fatal page fault in supervisor mode
trap type 6 code 0 rip 811a5c3e cs 8 rflags 10206 cr2 10 cpl a rsp 800022171e20
panic: trap type 6, code=0, pc=811a5c3e
Starting stack trace...
panic() at panic+0x10b
trap() at trap+0x7b8
--- trap (number 6) ---
pool_p_free() at pool_p_free+0x7e
pool_gc_pages() at pool_gc_pages+0xe4
taskq_thread() at taskq_thread+0x6c
end trace frame: 0x0, count: 252
End of stack trace.
syncing disks... 5 done



Re: pool related crashes, but "kernel did no panic"

2016-04-27 Thread Alexey Suslikov
Another one from my collection.

Apr 16:

ddb{0}> show panic
the kernel did not panic

ddb{0}> trace
pool_do_get() at pool_do_get+0x90
pool_get() at pool_get+0xb5
m_get() at m_get+0x28
sbappendaddr() at sbappendaddr+0x9a
uipc_usrreq() at uipc_usrreq+0x3b8
sosend() at sosend+0x3d8
dosendsyslog() at dosendsyslog+0x110
sys_sendsyslog2() at sys_sendsyslog2+0xbd
syscall() at syscall+0x368
--- syscall (number 112) ---
end of kernel
end trace frame: 0x183f8dab6913, count: -9
0x1842755e571a:

ddb{0}> show registers
rdi       0x7
rsi       0x9ff5c49ed229ae92
rbp       0x8000222f5b00
rbx       0xff022d80d6d0
rdx       0x8000222f5b64
rcx       0x818c76e0    cpu_info_primary
rax       0x7293fa06e984af44
r8        0
r9        0x1
r10       0x811c7c00    uipc_usrreq
r11       0x81344be0    copy_fault
r12       0x8194c000    mbpool
r13       0xff40b152a900
r14       0x2
r15       0x818b4570    sun_noname
rip       0x811a5340    pool_do_get+0x90
cs        0x8
rflags    0x10282    __ALIGN_SIZE+0xf282
rsp       0x8000222f5ab0
ss        0x10
pool_do_get+0x90:   movq    0(%r13),%rdi



pool related crashes, but "kernel did no panic"

2016-04-27 Thread Alexey Suslikov
Hi tech@.

(Maybe related to http://marc.info/?l=openbsd-bugs&m=146174654219490&w=2).

The crashing server acts as a carp backup (the master has the same hardware
config but doesn't crash, in contrast to the backup). I will post additional
information if necessary.

There's a collection of crashes (including pre-5.9 ones), but see below for
the most recent ones.

Any advice on tracking down the issue?

Thanks,
Alexey


OpenBSD 5.9-stable (GENERIC.MP) #0: Sun Mar 27 16:03:33 EEST 2016
***@***:/usr/src/sys/arch/amd64/compile/GENERIC.MP


Apr 15:

ddb{2}> show panic
the kernel did not panic

ddb{2}> trace
pool_do_get() at pool_do_get+0x90
pool_get() at pool_get+0xb5
ffs_vget() at ffs_vget+0xa7
ufs_lookup() at ufs_lookup+0x36f
VOP_LOOKUP() at VOP_LOOKUP+0x39
vfs_lookup() at vfs_lookup+0x277
namei() at namei+0x24c
dofstatat() at dofstatat+0x94
syscall() at syscall+0x368
--- syscall (number 40) ---
end of kernel
end trace frame: 0x45ea97030a0, count: -9
0x45e29dc70fa:

ddb{2}> show registers
rdi       0x
rsi       0x957581e21a424e5c
rbp       0x8000224a2a10
rbx       0xff02290e7810
rdx       0x8000224a2a74
rcx       0x80067000
rax       0x5abd427fd20d77f3
r8        0x30
r9        0
r10       0
r11       0x8000224a2a10
r12       0x819694c0    ffs_ino_pool
r13       0xff122c4b0968
r14       0x9
r15       0x407
rip       0x811a5340    pool_do_get+0x90
cs        0x8
rflags    0x10286    __ALIGN_SIZE+0xf286
rsp       0x8000224a29c0
ss        0x10
pool_do_get+0x90:   movq    0(%r13),%rdi


Apr 23:

ddb{2}> show panic
the kernel did not panic

ddb{2}> trace
pool_p_free() at pool_p_free+0x7e
pool_gc_pages() at pool_gc_pages+0xe4
taskq_thread() at taskq_thread+0x6c
end trace frame: 0x0, count: -3

ddb{2}> show registers
rdi       0x8194c000    mbpool
rsi       0x60329ee8bc5a0776
rbp       0x800022171e70
rbx       0xff009e7b3300
rdx       0x9fcd61e822213476
rcx       0xddbc8af92f3ff41a
rax       0x10
r8        0x1
r9        0xff0108eeda00
r10       0x1
r11       0x811a3e70    pool_page_free
r12       0xff022d8b7a50
r13       0
r14       0x8194c000    mbpool
r15       0x800022171e30
rip       0x811a5c3e    pool_p_free+0x7e
cs        0x8
rflags    0x10206    __ALIGN_SIZE+0xf206
rsp       0x800022171e20
ss        0x10
pool_p_free+0x7e:   movq    0(%rax),%rsi



Re: Optimize pledge-related notes in 59.html

2016-02-18 Thread Alexey Suslikov
Theo de Raadt  openbsd.org> writes:

> so thanks for your suggestion.  have you ever noticed how suggestions
> are taken less seriously when they are not formatted as a diff?

--- 59.html.orig    Thu Feb 18 11:45:24 2016
+++ 59.html Thu Feb 18 12:03:29 2016
@@ -100,21 +100,21 @@
 http://www.openbsd.org/cgi-bin/man.cgi?
query=dhclientsektion=8">dhclient(8) no longer exits if a desired 
route cannot be added. It now just reports the fact.
 http://www.openbsd.org/cgi-bin/man.cgi?
query=dhclientsektion=8">dhclient(8) now takes a much more careful 
approach to received packets to ensure only received data is used to process 
the packet. Packets with incorrect length information or lacking appropriate 
header information are now dropped.
 http://www.openbsd.org/cgi-bin/man.cgi?
query=dhclientsektion=8">dhclient(8) again disables pending 
timeouts if the interface link is lost, preventing endless retries at 
obtaining a lease.
-http://www.openbsd.org/cgi-bin/man.cgi?
query=dhclientsektion=8">dhclient(8) was pledged.
 http://www.openbsd.org/cgi-bin/man.cgi?
query=dhcpdsektion=8">dhcpd(8) again properly utilizes default-
lease-time, max-lease-time and bootp-lease-time options.
-http://www.openbsd.org/cgi-bin/man.cgi?
query=dhcpdsektion=8">dhcpd(8) was pledged.
 ...
 
 
 
 Security improvements:
 
-...
+http://www.openbsd.org/cgi-bin/man.cgi/?
query=pledge">pledge(2), a new subsystem for restricting operations in 
programs, was added.
+More than 200 daemons and programs were pledged, among them: http://www.openbsd.org/cgi-bin/man.cgi?
query=dhclient=8">dhclient(8), http://www.openbsd.org/cgi-bin/man.cgi?
query=dhcpd=8">dhcpd(8), http://www.openbsd.org/cgi-
bin/man.cgi?query=fdisk=8">fdisk(8), http://www.openbsd.org/cgi-bin/man.cgi/?
query=pdisk=8=macppc">pdisk(8).
 Support for looking up hosts via YP has been removed from libc.
   The 'yp' lookup method in
   http://www.openbsd.org/cgi-bin/man.cgi?
query=resolv.conf=5">resolv.conf
   is no longer available.
 Support for the HOSTALIASES environment variable has been removed 
from libc.
+   ...
 
 
 
@@ -123,7 +123,7 @@
 doas is a little friendlier to use
 Updated flex
 Updated and improved less
-http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-
current/man8/macppc/pdisk.8?query=pdisk">pdisk(8) was largely rewritten 
and pledged.
+http://www.openbsd.org/cgi-bin/man.cgi/?
query=pdisk=8=macppc">pdisk(8) was largely rewritten.
 Renaming files in the root directory of a MSDOS filesystem was 
fixed.
 Many obsolete http://www.openbsd.org/cgi-
bin/man.cgi/OpenBSD-current/man5/disktab.5?query=disktab">disktab(5) 
attributes and entries were removed.
 http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-
current/man4/softraid.4?query=softraid">softraid(4) volumes now 
correctly look for the disklabel in the first OpenBSD disk partition, not 
the last.
@@ -132,7 +132,6 @@
 http://www.openbsd.org/cgi-bin/man.cgi?
query=fdisksektion=8">fdisk(8) now has a '-b' flag that specifies 
the size of the EFI System partition to create.
 http://www.openbsd.org/cgi-bin/man.cgi?
query=fdisksektion=8">fdisk(8) now has a '-v' flag that causes a 
verbose display of both MBR and GPT information.
 http://www.openbsd.org/cgi-bin/man.cgi?
query=fdisksektion=8">fdisk(8) now provides full interactive GPT 
editing.
-http://www.openbsd.org/cgi-bin/man.cgi?
query=fdisksektion=8">fdisk(8) was pledged.
 Disks with sector sizes other than 512 bytes can now be partitioned 
with a GPT.
 The GPT kernel option was removed and GPT support is part of all 
GENERIC and GENERIC derived kernels.
 Many improvements were made to the GPT kernel support to ensure 
safe and reliable operation of GPT and MBR processing.



Optimize pledge-related notes in 59.html

2016-02-17 Thread Alexey Suslikov
Hi tech@.

pledge itself is a security feature, so maybe it is better to put
pledge under "Security improvements", like

"More than 200 daemons and programs were pledged, among them:
dhclient, dhcpd, fdisk" etc.

I would be happy to learn
a) that pledge is a security feature, and
b) how deep pledge usage is,
from a single block of text, without collecting pieces here and there.



Re: Make em(4) more mpsafe again

2016-01-14 Thread Alexey Suslikov
Juuso Lapinlampi  partyvan.eu> writes:

> > - * These parameters control when the driver calls the routine to reclaim
> > - * transmit descriptors.
> > + * Thise parameter controls the minimum number of available transmit
> > + * descriptors needed before we attempt transmission of a packet.
> >   */
> 
> There seems to be a typo in there. s/Thise/This/.

Hey, it's fun to copy-paste all around.

http://marc.info/?l=openbsd-tech&m=144362501114184&w=2



Re: pledge telnet

2015-11-13 Thread Alexey Suslikov
On Fri, Nov 13, 2015 at 8:40 PM, Theo de Raadt  wrote:
>> > On 2015/11/13 09:59, Theo de Raadt wrote:
>> > > > > I really want to delete telnet entirely,
>> > > >
>> > > > I often use it for testing unencrypted SMTP and HTTP across the
>> > > > Internet.  Which tool would you recommend for that purpose?
>> > >
>> > > nc(1).
>> > I use telnet fairly often for connecting to things like crappy switches,
>> > crappy routers, APs of varying crappiness, etc. nc -t isn't close to being
>> > good enough for this, also with nc it's difficult to send things like ^C
>> > (even worse, if you use it much you forget about this and end up killing
>> > your connection). I wouldn't mind having it removed from base, but would
>> > need to go in ports unless nc gets a lot of polishing.
>>
>> I always thought of telnet as a kind of discipline over the wire. There are
>> even extensions (like RFC 2217) well-fitting discipline model.
>
> Like a horse buggy in the inside lane of a 4-lane highway, there are going
> to be fatalities.
>
> "discipline" applies to the user of this code -- it means "avoid any and all
> unnecessary use".
>
>> On the other hand, nc(1) is a "raw" tool with a decent client-server model.
>>
>> Is there any possibility to run nc(1) as a privsep server, and a telnet(1) as
>> a client, talking to nc(1) server via IMSG (instead of doing network stuff
>> directly)?
>
> What's the goal?  To continue the lifetime of telnet?  To make the nc code
> more complicated and fragile?  Those are the only outcomes I see.

It is similar to (optional) XMODEM/ZMODEM disciplines over serial, IMO.

The goal is to delete classic telnet entirely and make it an (optional)
discipline frontend for nc(1). In "telnet mode", nc(1) would only attach the
discipline and let the user use flow control features (like ^C).

It is not about extending the lifetime of telnet; it is about making telnet
truly optional by making it a discipline (or flow control protocol), not a
separate tool.
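
To make the "discipline" idea a bit more concrete, here is a rough sketch of what an outgoing filter could look like. It is not nc(1) or telnet(1) code, and every name in it is hypothetical. It relies only on two facts from RFC 854: the IAC byte (255) introduces a command, so a literal 255 in the data has to be doubled, and "interrupt process" is sent as IAC IP (255, 244). Mapping the ^C character (0x03) to IAC IP is an assumption about how such a mode might behave.

```c
#include <stddef.h>

#define EX_IAC	255	/* "interpret as command" (RFC 854) */
#define EX_IP	244	/* "interrupt process" (RFC 854) */
#define EX_ETX	0x03	/* ASCII ETX, what ^C usually produces on a terminal */

/*
 * Copy inlen bytes from in to out, escaping IAC and turning ^C into
 * IAC IP.  Returns the number of bytes written, or 0 if out is too
 * small (or inlen is 0).
 */
size_t
ex_telnet_encode(const unsigned char *in, size_t inlen,
    unsigned char *out, size_t outlen)
{
	size_t i, o = 0;

	for (i = 0; i < inlen; i++) {
		if (in[i] == EX_ETX || in[i] == EX_IAC) {
			if (outlen - o < 2)
				return (0);
			out[o++] = EX_IAC;
			out[o++] = (in[i] == EX_ETX) ? EX_IP : EX_IAC;
		} else {
			if (outlen - o < 1)
				return (0);
			out[o++] = in[i];
		}
	}
	return (o);
}
```

Even this much has to sit between the user's input and the socket, which gives some feeling for the complexity concern raised in the replies.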



Re: pledge telnet

2015-11-13 Thread Alexey Suslikov
On Fri, Nov 13, 2015 at 9:00 PM, Theo de Raadt  wrote:
>> It is similar to (optional) XMODEM/ZMODEM disciplines over serial, IMO.
>
> No, it is similar to  over the INTERNET, because the INTERNET
> is nothing at all like a serial line, the latter generally being nicely
> contained to a single room.
>
>> The goal is to delete classic telnet entirely and make it an
>> (optional) discipline frontend for nc(1). In "telnet mode" nc(1)
>> will only attach discipline and let user use flow control features
>> (like ^C).
>
> You have a diff?
>
>> It is not about extending a lifetime of telnet, it is about making telnet
>> truly optional by making it a discipline (or flow control protocol), not a
>> separate tool.
>
> If you can do it without adding *any complexity* to nc, fine.
>
> Except I know you can't do that, it will add substantial complexity.
> So this seems like a pointless discussion.  nc is already more than
> complex enough.  Probably best to focus on making it more secure,
> before making it support the stone age.

Can telnet be extended to coexist with nc -F? The manual only mentions ssh.



Re: pledge telnet

2015-11-13 Thread Alexey Suslikov
Stuart Henderson wrote:

> On 2015/11/13 09:59, Theo de Raadt wrote:
> > > > I really want to delete telnet entirely,
> > >
> > > I often use it for testing unencrypted SMTP and HTTP across the
> > > Internet.  Which tool would you recommend for that purpose?
> >
> > nc(1).
> I use telnet fairly often for connecting to things like crappy switches,
> crappy routers, APs of varying crappiness, etc. nc -t isn't close to being
> good enough for this, also with nc it's difficult to send things like ^C
> (even worse, if you use it much you forget about this and end up killing
> your connection). I wouldn't mind having it removed from base, but would
> need to go in ports unless nc gets a lot of polishing.

I always thought of telnet as a kind of discipline over the wire. There are
even extensions (like RFC 2217) that fit the discipline model well.

On the other hand, nc(1) is a "raw" tool with a decent client-server model.

Is it possible to run nc(1) as a privsep server and telnet(1) as a client
talking to the nc(1) server via IMSG (instead of doing network stuff
directly)?



Re: mpsafe gem(4)

2015-10-22 Thread Alexey Suslikov
Martin Pieuchot  openbsd.org> writes:

> + /*
> +  * If we have enough room, clear IFF_OACTIVE to tell the stack
> +  * that it iss OK to send packets.
> +  */

there's a typo here. "that it iss" should be "that it is".



Re: ifdef DIAGNOSTIC in azalia.c

2015-10-13 Thread Alexey Suslikov
Alexey Suslikov  gmail.com> writes:

> 
> Alexey Suslikov  gmail.com> writes:
> 
> > If there is a need to debug something in azalia.c, defining DIAGNOSTIC
> > is overkill, so replace two instances of DIAGNOSTIC with AZALIA_DEBUG
> > (DPRINTF->printf suggested by ratchov@).
> > 
> > Also, entirely remove the third instance of DIAGNOSTIC. Normally it is
> > not compiled and, ratchov@ thinks, it is related to deleted code.
> > 
> > Okays? Comments?
> 
> Is there any interest in this? Alexandre, did I correctly understand the
> outcome of our discussion?

ping

re-sending in case diff got missed.

--- azalia.c.orig   Wed Sep 23 16:10:19 2015
+++ azalia.c        Wed Sep 23 16:11:47 2015
@@ -1170,9 +1170,9 @@
uint32_t verb;
uint16_t corbwp;
 
-#ifdef DIAGNOSTIC
+#ifdef AZALIA_DEBUG
if ((AZ_READ_1(az, CORBCTL) & HDA_CORBCTL_CORBRUN) == 0) {
-   DPRINTF(("%s: CORB is not running.\n", XNAME(az)));
+   printf("%s: CORB is not running.\n", XNAME(az));
return(-1);
}
 #endif
@@ -1196,9 +1196,9 @@
int i;
uint16_t wp;
 
-#ifdef DIAGNOSTIC
+#ifdef AZALIA_DEBUG
if ((AZ_READ_1(az, RIRBCTL) & HDA_RIRBCTL_RIRBDMAEN) == 0) {
-   DPRINTF(("%s: RIRB is not running.\n", XNAME(az)));
+   printf("%s: RIRB is not running.\n", XNAME(az));
return(-1);
}
 #endif
@@ -4054,12 +4054,6 @@
/* number of blocks must be <= HDA_BDL_MAX */
az = v;
size = az->pstream.buffer.size;
-#ifdef DIAGNOSTIC
-   if (size <= 0) {
-   printf("%s: size is 0", __func__);
-   return 256;
-   }
-#endif
if (size > HDA_BDL_MAX * blk) {
blk = size / HDA_BDL_MAX;
if (blk & 0x7f)



Re: ifdef DIAGNOSTIC in azalia.c

2015-10-05 Thread Alexey Suslikov
Alexey Suslikov  gmail.com> writes:

> If there is a need to debug something in azalia.c, defining DIAGNOSTIC
> is overkill, so replace two instances of DIAGNOSTIC with AZALIA_DEBUG
> (DPRINTF->printf suggested by ratchov@).
> 
> Also, entirely remove the third instance of DIAGNOSTIC. Normally it is
> not compiled and, ratchov@ thinks, it is related to deleted code.
> 
> Okays? Comments?

Is there any interest in this? Alexandre, did I correctly understand the
outcome of our discussion?



Re: Unlocking ix(4) a bit further

2015-09-30 Thread Alexey Suslikov
Mark Kettenis  xs4all.nl> writes:

> + * Thise parameter controls the minimum number of available transmit

"Thise" should be "This" here.



ifdef DIAGNOSTIC in azalia.c

2015-09-25 Thread Alexey Suslikov
Hi tech@.

If there is a need to debug something in azalia.c, defining DIAGNOSTIC
is overkill, so replace two instances of DIAGNOSTIC with AZALIA_DEBUG
(DPRINTF->printf suggested by ratchov@).

Also, entirely remove the third instance of DIAGNOSTIC. Normally it is
not compiled and, ratchov@ thinks, it is related to deleted code.

Okays? Comments?

--- azalia.c.orig   Wed Sep 23 16:10:19 2015
+++ azalia.c        Wed Sep 23 16:11:47 2015
@@ -1170,9 +1170,9 @@
uint32_t verb;
uint16_t corbwp;
 
-#ifdef DIAGNOSTIC
+#ifdef AZALIA_DEBUG
if ((AZ_READ_1(az, CORBCTL) & HDA_CORBCTL_CORBRUN) == 0) {
-   DPRINTF(("%s: CORB is not running.\n", XNAME(az)));
+   printf("%s: CORB is not running.\n", XNAME(az));
return(-1);
}
 #endif
@@ -1196,9 +1196,9 @@
int i;
uint16_t wp;
 
-#ifdef DIAGNOSTIC
+#ifdef AZALIA_DEBUG
if ((AZ_READ_1(az, RIRBCTL) & HDA_RIRBCTL_RIRBDMAEN) == 0) {
-   DPRINTF(("%s: RIRB is not running.\n", XNAME(az)));
+   printf("%s: RIRB is not running.\n", XNAME(az));
return(-1);
}
 #endif
@@ -4054,12 +4054,6 @@
/* number of blocks must be <= HDA_BDL_MAX */
az = v;
size = az->pstream.buffer.size;
-#ifdef DIAGNOSTIC
-   if (size <= 0) {
-   printf("%s: size is 0", __func__);
-   return 256;
-   }
-#endif
if (size > HDA_BDL_MAX * blk) {
blk = size / HDA_BDL_MAX;
if (blk & 0x7f)
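
A note on parentheses when a DPRINTF((...)) call becomes a plain printf call. Debug macros of this style usually take the whole argument list as one parenthesized argument, so the call site carries two pairs of parentheses; a direct printf() call needs only one, and leaving the extra pair in place would turn the arguments into a comma expression. The sketch below shows both forms side by side. It is a userland illustration under assumptions: `EX_DEBUG`, `EX_CORBRUN`, `ex_printf` and the `ex_check_*` functions are hypothetical stand-ins, and the macro definition is a typical shape, not a copy of the driver's.

```c
#include <stdarg.h>
#include <stdio.h>

#define EX_DEBUG		/* hypothetical switch, in the role of AZALIA_DEBUG */
#define EX_CORBRUN	0x02	/* hypothetical "engine running" bit */

int ex_msgs;			/* number of diagnostics printed so far */

/* printf() wrapper that also counts calls, so the effect is observable. */
int
ex_printf(const char *fmt, ...)
{
	va_list ap;
	int n;

	ex_msgs++;
	va_start(ap, fmt);
	n = vprintf(fmt, ap);
	va_end(ap);
	return (n);
}

/* Macro form: the argument list travels as ONE parenthesized argument. */
#ifdef EX_DEBUG
#define DPRINTF(x)	do { ex_printf x; } while (0)
#else
#define DPRINTF(x)	do { } while (0)
#endif

int
ex_check_macro(const char *name, unsigned int ctl)
{
#ifdef EX_DEBUG
	if ((ctl & EX_CORBRUN) == 0) {
		DPRINTF(("%s: CORB is not running.\n", name));
		return (-1);
	}
#endif
	return (0);
}

int
ex_check_direct(const char *name, unsigned int ctl)
{
#ifdef EX_DEBUG
	if ((ctl & EX_CORBRUN) == 0) {
		/* Direct call: a single pair of parentheses. */
		ex_printf("%s: CORB is not running.\n", name);
		return (-1);
	}
#endif
	return (0);
}
```

With `EX_DEBUG` removed, both checks compile away and the functions always return 0, which mirrors what a dedicated debug option is meant to do.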



Re: Dropping needless globals (ksh)

2015-09-10 Thread Alexey Suslikov
Michael McConville  sccs.swarthmore.edu> writes:

> RCS file: /cvs/src/bin/ksh/c_ksh.c,v



> - shprintf(newline);
> + shprintf("\n");

In terms of portability, are you sure newline is \n on all platforms?



Re: Avoid grabbing the kernel lock in pool backend allocator

2015-09-05 Thread Alexey Suslikov
Mark Kettenis  xs4all.nl> writes:

> RCS file: /cvs/src/sys/kern/subr_pool.c,v



>   kd.kd_waitok = ISSET(flags, PR_WAITOK);



> + /* 
> +  * XXX Until we can call msleep(9) without holding the kernel
> +  * lock.
> +  */
> + if (ISSET(flags, PR_WAITOK))

Is there a reason to re-evaluate ISSET when its result is already stored
in kd.kd_waitok?
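
For illustration, the two variants being compared could look roughly like this. It is only a sketch: the `ISSET` definition has the usual `((t) & (f))` shape, while the flag values, the `ex_dyn_mode` structure and both functions are invented for the example and are not the code from the quoted diff.

```c
#ifndef ISSET
#define ISSET(t, f)	((t) & (f))	/* same shape as the kernel macro */
#endif

#define EX_WAITOK	0x0001	/* hypothetical flag values */
#define EX_NOWAIT	0x0002
#define EX_ZERO		0x0008

struct ex_dyn_mode {
	int	kd_waitok;	/* nonzero when the caller may sleep */
};

/* Tests the flag a second time, as in the quoted hunk. */
int
ex_may_sleep_retest(int flags)
{
	struct ex_dyn_mode kd;

	kd.kd_waitok = ISSET(flags, EX_WAITOK);
	(void)kd;
	return (ISSET(flags, EX_WAITOK) != 0);
}

/* Reuses the value that was already stored. */
int
ex_may_sleep_cached(int flags)
{
	struct ex_dyn_mode kd;

	kd.kd_waitok = ISSET(flags, EX_WAITOK);
	return (kd.kd_waitok != 0);
}
```

Both functions give the same answer for every flag combination, so the choice is about readability rather than behaviour. One caveat: ISSET yields the masked bits, not 0 or 1, so the stored value is only safe to reuse as a truth value.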



Wrong man links on 58.html

2015-08-24 Thread Alexey Suslikov
In

wscons(4) works with even more odd trackpads.
Added pvbus(4) paravirtual device tree root on virtual machines that
are running on hypervisors.

http://www.openbsd.org/cgi-bin/man.cgi?query=wscons(4)&sec=4
http://www.openbsd.org/cgi-bin/man.cgi?query=pvbus(4)&sec=4

are wrong. They should be

http://www.openbsd.org/cgi-bin/man.cgi?query=wscons&sec=4
http://www.openbsd.org/cgi-bin/man.cgi?query=pvbus&sec=4



size: cannot read a.out: No such file or directory

2015-08-24 Thread Alexey Suslikov
Hi tech@.

size(1) DESCRIPTION says:

... If no file is specified size attempts to report on the file a.out.

And, indeed, it warns:

$ size
size: cannot read a.out: No such file or directory

The above message looks misleading in an a.out-less world.

Cheers,
Alexey



Re: [Patch] pf refactoring

2015-08-17 Thread Alexey Suslikov
Martin Pieuchot <mpi at openbsd.org> writes:

> On 17/08/15(Mon) 17:39, Richard Procter wrote:
> > Hi,
> >
> > This series of 29 small diffs slims pf.o by 2640 bytes and pf.c by
> > 113 non-comment lines.
>
> We generally discuss one diff per mail.  It makes it easier for people
> to comment and as you can imagine deal with possible conflicts in their
> tree :)

As far as I understood, Richard provided step-by-step refactor diffs.

What you want to discuss is a cumulative diff he referred to.



sys/arch/{hppa,hppa64}/dev/apic.c cosmetics, Was:Re: Brainy: User-Triggerable Kernel Memory Leak in execve()

2015-08-09 Thread Alexey Suslikov
Christian Schulte cs at schulte.it writes:

 _14/ UNINITIALIZED VARIABLE: sys/arch/hppa64/dev/apic.c rev1.8
   At l.176, 'cnt' is not initialized.

I came up with the following.

--- sys/arch/hppa/dev/apic.c.orig   Sun Aug  9 14:16:56 2015
+++ sys/arch/hppa/dev/apic.cSun Aug  9 14:30:47 2015
@@ -171,12 +171,11 @@
 
aiv = malloc(sizeof(struct apic_iv), M_DEVBUF, M_NOWAIT);
if (aiv == NULL) {
-   free(cnt, M_DEVBUF, 0);
-   return NULL;
+   return (NULL);
}
 
cnt = malloc(sizeof(struct evcount), M_DEVBUF, M_NOWAIT);
-   if (!cnt) {
+   if (cnt == NULL) {
free(aiv, M_DEVBUF, 0);
return (NULL);
}
--- sys/arch/hppa64/dev/apic.c.orig Sun Aug  9 14:16:47 2015
+++ sys/arch/hppa64/dev/apic.c  Sun Aug  9 14:31:14 2015
@@ -173,8 +173,7 @@
 
aiv = malloc(sizeof(struct apic_iv), M_DEVBUF, M_NOWAIT);
if (aiv == NULL) {
-   free(cnt, M_DEVBUF, 0);
-   return NULL;
+   return (NULL);
}
 
aiv->sc = sc;
@@ -185,7 +184,7 @@
aiv->cnt = NULL;
if (apic_intr_list[irq]) {
cnt = malloc(sizeof(struct evcount), M_DEVBUF, M_NOWAIT);
-   if (!cnt) {
+   if (cnt == NULL) {
free(aiv, M_DEVBUF, 0);
return (NULL);
}
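
The pattern the diff moves towards can be sketched in userland as follows: free only what has actually been allocated at the point of failure. This is an illustration, not the apic.c code. The `ex_` names, the simplified structures and the failure-injecting allocator are all hypothetical; the allocator exists only so that both error paths can be exercised.

```c
#include <stdlib.h>

struct ex_cnt {
	unsigned long	count;
};

struct ex_iv {
	int		irq;
	struct ex_cnt	*cnt;
};

int ex_fail_at = -1;	/* index of the allocation that should fail, -1: none */
int ex_calls;		/* allocations attempted so far */
int ex_live;		/* allocations not yet freed */

void *
ex_alloc(size_t n)
{
	void *p;

	if (ex_calls++ == ex_fail_at)
		return (NULL);
	p = malloc(n);
	if (p != NULL)
		ex_live++;
	return (p);
}

void
ex_free(void *p)
{
	if (p != NULL) {
		ex_live--;
		free(p);
	}
}

struct ex_iv *
ex_intr_establish(int irq, int shared)
{
	struct ex_iv *aiv;
	struct ex_cnt *cnt;

	aiv = ex_alloc(sizeof(*aiv));
	if (aiv == NULL)
		return (NULL);	/* cnt is not allocated yet: nothing to free */
	aiv->irq = irq;
	aiv->cnt = NULL;

	if (shared) {
		cnt = ex_alloc(sizeof(*cnt));
		if (cnt == NULL) {
			ex_free(aiv);	/* undo the one earlier allocation */
			return (NULL);
		}
		cnt->count = 0;
		aiv->cnt = cnt;
	}
	return (aiv);
}
```

Whichever allocation fails, the function returns NULL with no allocation left outstanding and without touching a pointer that was never assigned.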



Re: Brainy: User-Triggerable Kernel Memory Leak in execve()

2015-08-09 Thread Alexey Suslikov
Theo de Raadt <deraadt at cvs.openbsd.org> writes:

> I would like to point out the noise is coming from *users* -- not from
> actual developers in the project.

http://www.imdb.com/title/tt1278449/

you'll get the idea.



Re: Brainy: User-Triggerable Kernel Memory Leak in execve()

2015-08-08 Thread Alexey Suslikov
On Sat, Aug 8, 2015 at 2:21 PM, Christian Schulte <c...@schulte.it> wrote:
> On 08/07/15 at 23:46, Alexey Suslikov wrote:
>>
>> Christian Schulte <cs at schulte.it> writes:
>>
>>>> Now, I believe that this effort is too much for my spare time.
>>>
>>> Then why not release that scanner? That effort could be shared. What's
>>> so secret about it? You have been asked several times already.
>>
>> Start sharing right now. The Brainy OpenBSD page contains info about
>> a lot of bugs already found. Nothing secret is needed to start writing
>> diffs and pushing them.
>
> I was thinking about automating that process. Scan-before-commit, for
> example. Need not be that particular scanner. Some pre-commit analysis
> beyond what the compiler can warn about. How can I be sure the issues found
> by that scanner are not issues with the scanner itself?


Looks like you haven't read carefully. Quote:

Developing, improving and maintaining Brainy takes time and energy, as
well as investigating and packaging the bugs and vulnerabilities it
finds.

You already have bugs found. Next step in the process is to write diffs.



Re: Brainy: User-Triggerable Kernel Memory Leak in execve()

2015-08-07 Thread Alexey Suslikov
Christian Schulte <cs at schulte.it> writes:

> > Now, I believe that this effort is too much for my spare time.
>
> Then why not release that scanner? That effort could be shared. What's
> so secret about it? You have been asked several times already.

Start sharing right now. The Brainy OpenBSD page contains info about
a lot of bugs already found. Nothing secret is needed to start writing
diffs and pushing them.



Re: Warning on assembly of RdRand instructions

2015-08-04 Thread Alexey Suslikov
Michael McConville <mmcconv1 at sccs.swarthmore.edu> writes:

> https://www.hyperelliptic.org/tanja/vortraege/random.pdf

made my day:

“The way RDRAND is being used in kernels = 3.12.3 allows it to
cancel out the other entropy. See extract buf().”
“if I make RDRAND return [EDX] ^ 0x41414141, /dev/urandom
output will be all ’A’.”
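
The arithmetic behind the quoted claim is easy to check in isolation. The sketch below is a toy model only: it assumes the final output is a plain XOR of the other entropy with the hardware value, and it assumes a hostile hardware source that can see that other value. Neither function is taken from any kernel; the names are made up. Note that 0x41 is the ASCII code for 'A'.

```c
#include <stdint.h>

/* Toy mixer: combine the pool-derived word with the hardware word by XOR. */
uint32_t
ex_mix(uint32_t pool, uint32_t hw)
{
	return (pool ^ hw);
}

/* Hostile hardware source that can observe the other input. */
uint32_t
ex_hostile_hw(uint32_t pool)
{
	return (pool ^ UINT32_C(0x41414141));
}
```

Since `pool ^ (pool ^ k)` is `k` for any `pool`, the mixed output is always 0x41414141, i.e. four 'A' bytes, no matter how good the other entropy was. Whether real hardware could do this is a separate question that the sketch does not address.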



Re: audio: recover after missed interrupts

2015-07-28 Thread Alexey Suslikov
Alexandre Ratchov <alex at caoua.org> writes:

> DPRINTFN(1, "%s: rec ptr wrapped, moving %d blocs\n",
> DPRINTFN(1, "%s: play ptr wrapped, moving %d blocs\n",

"blocs" in the above DPRINTFNs should be "blocks", I think.



Re: Update to /etc/services

2015-07-26 Thread Alexey Suslikov
Denis Fondras openbsd at ledeuns.net writes:

  krb524   /tcp# Kerberos 5-4

I would tweak the krb524 comment to be

# Kerberos 5 to 4

because this is how krb524 reads.



Re: Brainy: User-Triggerable Kernel Memory Leak in execve()

2015-07-21 Thread Alexey Suslikov
Ville Valkonen <weezelding at gmail.com> writes:

> On Jul 21, 2015 9:32 AM, Maxime Villard <max at m00nbsd.net> wrote:
> > It is not the last bug Brainy has found, but it is the last one I
> > report. I don't have time for that.
> >
> > Maxime
>
> Why such a dramatic tone?

Because that famous thank you small people sounds more and more
ridiculous (some says Goebels'ish), no?



Re: Brainy: User-Triggerable Kernel Memory Leak in execve()

2015-07-21 Thread Alexey Suslikov
sam <sam at cmpct.info> writes:

> How about you release the Brainy Code Scanner then?
>
> I have so many bugs; in fact, there are so many, I don't even have the
> time to report them! My scanner is so good!
>
> Or perhaps you should report 'just' the relatively important ones?

Made my day.

Searching for bugs is for brainy. Victims of propaganda don't even
search archives.



Re: Use m_defrag in intel wireless drivers

2015-05-27 Thread Alexey Suslikov
Mark Kettenis <mark.kettenis at xs4all.nl> writes:

> Index: if_ipw.c
>   /* too many fragments, linearize */
>
> Index: if_iwi.c
>   /* too many fragments, linearize */
>
> Index: if_iwn.c
>   /* Too many DMA segments, linearize mbuf. */
>
> Index: if_wpi.c
>   /* Too many DMA segments, linearize mbuf. */

These comments can be homogenized. I'd choose the iwn/wpi variant.



Tcl/Tk entry in www/57.html

2015-03-31 Thread Alexey Suslikov
Hi tech@.

The "Tcl/Tk 8.5.16 and 8.6.2" line (in the "Some highlights" section) appears twice:

...
Tcl/Tk 8.5.16 and 8.6.2
TeX Live 2013
Tcl/Tk 8.5.16 and 8.6.2
...



Re: ntpd:support adjusting initial time = y2k36 on 32-bit time_t platforms

2015-03-23 Thread Alexey Suslikov
Brent Cook busterb at gmail.com writes:

 + T4 += (uint64_t)tv.tv_sec + JAN_1970 + 1.0e-6 * tv.tv_usec;

snip

 + return ((uint64_t)tv.tv_sec + JAN_1970 + 1.0e-6 * tv.tv_usec);

snip

Can gettime_from_timeval be used throughout the code instead of
repeating the same chunk?

T4 += gettime_from_timeval(...

return gettime_from_timeval(...
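
A minimal sketch of what such a helper could look like, so the
expression lives in one place. The pointer argument and the JAN_1970
value (seconds between 1900 and 1970) are assumptions here; the quoted
diff does not show either.

```c
#include <stdint.h>
#include <sys/time.h>

/* Assumed value: seconds between the NTP epoch (1900) and 1970. */
#define JAN_1970	2208988800UL

/*
 * Convert a struct timeval to seconds since the NTP epoch.
 * The cast to uint64_t keeps the sum from overflowing where
 * time_t is only 32 bits wide.
 */
double
gettime_from_timeval(const struct timeval *tv)
{
	return ((uint64_t)tv->tv_sec + JAN_1970 + 1.0e-6 * tv->tv_usec);
}
```

Both call sites would then reduce to T4 += gettime_from_timeval(&tv)
and return gettime_from_timeval(&tv).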



Re: Questions about 802.11n support

2015-03-05 Thread Alexey Suslikov
T. Jameson Little beatgammit at gmail.com writes:
 Well, I'm much more capable of fixing existing drivers to make it work
 well than building something from scratch, and I imagine the same is
 true for many developers, because you work on whatever affects you.

IMO, fixing existing drivers should take popularity into account.

I asked sthen@ some time ago (in early 2013) about 802.11 drivers
usage (according to dmesg logs), and he replied:

we already have information about chips from dmesglog. since may 2009:

   2 an
   2 malo
   2 urtwn
   4 atu
   4 zyd
   7 acx
   7 otus
   7 ural
  13 rsu
  13 uath
  16 ipw
  33 wi
  43 iwi
  44 run
  50 rum
  67 bwi
 105 urtw
 107 ral
 114 wpi
 171 ath
 199 athn
 547 iwn

(end of quote).

So, IMO, fixing Intel's drivers may be the preferred way to go
because of higher usage and better quality/documentation.



Re: Questions about 802.11n support

2015-03-05 Thread Alexey Suslikov
On Thu, Mar 5, 2015 at 11:45 PM, Stefan Sperling s...@stsp.name wrote:
 On Thu, Mar 05, 2015 at 09:22:51PM +, Alexey Suslikov wrote:
 T. Jameson Little beatgammit at gmail.com writes:
  Well, I'm much more capable of fixing existing drivers to make it work
  well than building something from scratch, and I imagine the same is
  true for many developers, because you work on whatever affects you.

 IMO, fixing existing drivers should take popularity into account.

 I asked sthen@ some time ago (in early 2013) about 802.11 drivers
 usage (according to dmesg logs), and he replied:

 we already have information about chips from dmesglog. since may 2009:

     2 an
     2 malo
     2 urtwn
     4 atu
     4 zyd
     7 acx
     7 otus
     7 ural
    13 rsu
    13 uath
    16 ipw
    33 wi
    43 iwi
    44 run
    50 rum
    67 bwi
   105 urtw
   107 ral
   114 wpi
   171 ath
   199 athn
   547 iwn

 (end of quote).

 So, IMO, fixing Intel's drivers maybe be kinda preferred way to go
 because of higher usage and better quality/documentation.

 This list doesn't count unsupported devices. It is skewed towards built-in
 devices, e.g. urtwn is quite common but it is at the bottom of this list.
 I think these numbers just mean that most laptop installs happen on thinkpads.

Yes, I understand. I have urtwn too, because the built-in

Ralink RT3290 rev 0x00 at pci2 dev 0 function 0 not configured

is not supported (I tried to hack on top of the Linux driver with no success).

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=a89534edaaa7008992b878680490e9b02a665563

My point was, development should start around something widespread
so people can test easily. This may be urtwn, iwn and iwm, for instance.



Re: Questions about 802.11n support

2015-03-05 Thread Alexey Suslikov
T. Jameson Little beatgammit at gmail.com writes:

 Since USB 2.0 has a maximum throughput of 480Mbit/s, anything higher
 than 300Mbit/s is not particularly important, and many consumer devices
 only support 150Mbit/s anyway. 72Mbit/s is completely fine for an
 initial implementation.

I slightly disagree here because newer PCIe connected chips can do more.



Re: Brainy: User-Triggerable Kernel Memory Leak

2015-01-30 Thread Alexey Suslikov
Maxime Villard max at M00nBSD.net writes:

 'lsa' being user-controllable, it is easy for a local (un)privileged
 user to cause the kernel to run out of memory and become unresponsive.
 OpenBSD 5.6/i386 is affected, and perhaps previous releases.

compat_linux(8) says:

The Linux compatibility feature is active for kernels compiled with the
COMPAT_LINUX option and kern.emul.linux sysctl(8) enabled.



Re: elantech-v4 clickpad support

2015-01-13 Thread Alexey Suslikov
Ulf Brosziewski ulf.brosziewski at t-online.de writes:

 I have written two patches that provide these options (I'm using them
 on an Acer V5-131 netbook with OpenBSD 5.6/amd64, the clickpad hardware
 and firmware is identified as Elantech Clickpad, version 4, firmware
 0x461f02). There is, however, an open question concerning wsconscomm.

Should I try on

bios0: ASUSTeK COMPUTER INC. X200CA
pms0: Elantech Clickpad, version 4, firmware 0x361f01

or is that somewhat different hardware/firmware?



Re: Kernel does not compile with option LOCKDEBUG

2015-01-11 Thread Alexey Suslikov
Philip Guenther guenther at gmail.com writes:

 It's dead, Jim, let's bury LOCKDEBUG.

There is a define AZALIA_LOG_MP and accompanying code in
sys/dev/pci/azalia.c which looks like a debug left-over.

azalia(4) has been considered MP-safe for over a year now.



PATCH: azalia(4) invalid index crash

2014-12-25 Thread Alexey Suslikov
Hi tech@.

See http://marc.info/?l=openbsd-bugs&m=141867088702648&w=2

Reported by t...@openmailbox.org and John M. Molloy moll...@acm.org;
this diff is confirmed to fix the issue.

--- azalia.c.orig   Mon Dec 15 23:23:14 2014
+++ azalia.c    Wed Dec 17 13:42:41 2014
@@ -2348,14 +2348,23 @@
 if (ret >= 0)
 return ret;
 }
-   } else {
-   index = w->connections[w->selected];
-   if (VALID_WIDGET_NID(index, this)) {
-   ret = azalia_codec_find_defdac(this, index,
-   depth);
-   if (ret >= 0)
-   return ret;
-   }
+   /* 7.3.3.2 Connection Select Control
+* If an attempt is made to Set an index value greater than
+* the number of list entries (index is equal to or greater
+* than the Connection List Length property for the widget)
+* the behavior is not predictable.
+*/
+
+   /* negative index values are wrong too */
+   } else if (w->selected >= 0 &&
+   w->selected < sizeof(w->connections)) {
+   index = w->connections[w->selected];
+   if (VALID_WIDGET_NID(index, this)) {
+   ret = azalia_codec_find_defdac(this,
+   index, depth);
+   if (ret >= 0)
+   return ret;
+   }
 }
 }
 
@@ -2393,14 +2402,23 @@
 if (ret >= 0)
 return ret;
 }
-   } else {
-   index = w->connections[w->selected];
-   if (VALID_WIDGET_NID(index, this)) {
-   ret = azalia_codec_find_defadc_sub(this, node,
-   index, depth);
-   if (ret >= 0)
-   return ret;
-   }
+   /* 7.3.3.2 Connection Select Control
+* If an attempt is made to Set an index value greater than
+* the number of list entries (index is equal to or greater
+* than the Connection List Length property for the widget)
+* the behavior is not predictable.
+*/
+
+   /* negative index values are wrong too */
+   } else if (w->selected >= 0 &&
+   w->selected < sizeof(w->connections)) {
+   index = w->connections[w->selected];
+   if (VALID_WIDGET_NID(index, this)) {
+   ret = azalia_codec_find_defadc_sub(this,
+   node, index, depth);
+   if (ret >= 0)
+   return ret;
+   }
 }
 }
 return -1;
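
For reference, a stand-alone sketch of the same range check against an
element count rather than sizeof, since sizeof on an array yields bytes
and on a pointer yields the pointer size. The struct widget below, its
nconnections field and MAX_CONNECTIONS are hypothetical stand-ins for
illustration, not the azalia(4) definitions.

```c
#include <stddef.h>

#define MAX_CONNECTIONS	32	/* hypothetical list capacity */

/* Hypothetical stand-in for a codec widget; not the azalia(4) structure. */
struct widget {
	int	selected;		/* index reported by the codec */
	int	nconnections;		/* entries actually filled in */
	int	connections[MAX_CONNECTIONS];
};

/*
 * Return the selected connection, or -1 if the index is negative or
 * not below the number of valid list entries.
 */
int
selected_connection(const struct widget *w)
{
	if (w->selected < 0 || w->selected >= w->nconnections)
		return -1;
	return w->connections[w->selected];
}
```

Whether the driver keeps such a count next to the list is an assumption;
if it does, comparing against it would be a tighter bound than sizeof.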



PATCH: more of airport

2014-12-25 Thread Alexey Suslikov
Hi tech@.

Fixing existing names, plus some new.

--- airport.orig    Wed Dec 17 14:16:04 2014
+++ airport Wed Dec 17 14:25:51 2014
@@ -328,6 +328,7 @@
 CJB:Peelamedu, Coimbatore, India
 CJS:International Abraham Gonzalez, Ciudad Juarez, Chihuahua, Mexico
 CJU:Cheju, Cheju, South Korea
+CKC:Cherkasy International, Cherkasy, Ukraine
 CKB:North Central West Virginia, Bridgeport, West Virginia, USA
 CKS:International / Brasilia Brazil, Carajas, Para, Brazil
 CKY:Conakry, Conakry, Guinea
@@ -389,7 +390,7 @@
 CVN:Clovis, New Mexico, USA
 CWA:Central Wisconsin, Wausau, Wisconsin, USA
 CWB:Afonso Pena, Curitiba, Parana, Brazil
-CWC:Chernovtsy, Ukraine
+CWC:Chernivtsi International, Chernivtsi, Ukraine
 CWT:Cowra, Cowra, New South Wales, Australia
 CXH:Vancouver Harbour, British Columbia, Canada
 CYB:Cayman Brac Island, Cayman Islands
@@ -433,11 +434,11 @@
 DMK:Don Mueang International, Bangkok, Thailand
 DMR:Dhamar, Yemen
 DND:Dundee, Scotland, United Kingdom
-DNK:Dnepropetrovsk, Ukraine
+DNK:Dnipropetrovs'k International, Dnipropetrovs'k, Ukraine
 DNM:Denham, Western Australia, Australia
 DNV:Vermilion County, Danville, Illinois, USA
 DOH:Doha, Qatar
-DOK:Donetsk, Ukraine
+DOK:Donets'k Sergey Prokofiev International, Donets'k, Ukraine
 DOL:Saint Gatien, Deauville, France
 DOM:Melville Hall, Dominica
 DPL:Dipolog, Dipolog, Philippines
@@ -694,7 +695,7 @@
 HRB:Harbin, China
 HRE:Harare, Zimbabwe
 HRG:Hurghada, Egypt
-HRK:Krarkov, Ukraine
+HRK:Kharkiv International, Kharkiv, Ukraine
 HRL:Harlingen, Texas, USA
 HRO:Boone County, Harrison, Arkansas, USA
 HSI:Hastings, Nebraska, USA
@@ -721,7 +722,8 @@
 IBZ:Ibiza, Spain
 ICT:Wichita Mid-Continent, Kansas, USA
 IDA:Idaho Falls, Idaho, USA
-IEV:Zhulhany, Kiev, Ukraine
+IEV:Kyiv Zhulyany International, Kyiv, Ukraine
+IFO:Ivano-Frankivs'k International, Ivano-Frankivs'k, Ukraine
 IFP:Bullhead City, Arizona, USA
 IGA:Inagua, Bahamas
 IGM:Mohave County, Kingman, Arizona, USA
@@ -825,7 +827,7 @@
 KAL:Kaltag, Alaska, USA
 KAN:Aminu Kano International, Nigeria
 KAT:Kaitaia, New Zealand
-KBP:Borispol, Kiev, Ukraine
+KBP:Kyiv Borispil International, Kyiv, Ukraine
 KBR:Sultan Ismail Petra, Kota Bharu, Malaysia
 KCG:Fisheries, Chignik, Alaska, USA
 KCH:Kuching, Sarawak, Malaysia
@@ -842,6 +844,7 @@
 KER:Kerman, Iran
 KGC:Kingscote, South Australia, Australia
 KGD:Kaliningrad, Russia
+KHE:Kherson International, Kherson, Ukraine
 KHH:Kaohsiung, Taiwan
 KHI:Karachi, Pakistan
 KHV:Novy, Khabarovsk, Russia
@@ -900,6 +903,7 @@
 KVB:Skovde, Sweden
 KWA:Kwajalein, Marshall Islands
 KWI:Kuwait International
+KWG:Kryvyi Rih International, Dnipropetrovs'k, Ukraine
 KWL:Guilin, China
 KZI:Kozani, Greece
 KZN:Kazan, Russia
@@ -1000,7 +1004,7 @@
 LVK:Livermore, California, USA
 LWB:Greenbrier Valley, West Virginia, USA
 LWK:Tingwall, Shetland Islands /Shetland Isd, Scotland, United Kingdom
-LWO:Snilow, Lvov, Ukraine
+LWO:Lviv Danylo Halytskyi International, Lviv, Ukraine
 LWT:Lewistown Municipal, Montana, USA
 LWY:Lawas, Sarawak, Malaysia
 LXR:Luxor, Egypt
@@ -1105,6 +1109,7 @@
 MPB:Miami Public Seaplane Base, Florida, USA
 MPL:Frejorgues, Montpellier, France
 MPM:Maputo International, Mozambique
+MPW:Mariupol International, Donets'k, Ukraine
 MQL:Mildura, Victoria, Australia
 MQN:Rossvoll, Mo I Rana, Norway
 MQP:Kruger Mpumalanga International, Nelspruit, South Africa
@@ -1167,6 +1172,7 @@
 NKG:Nanjing, China
 NLA:Ndola, Zambia
 NLD:Nuevo Laredo, Tamaulipas, Mexico
+NLV:Mykolayiv International, Mykolayiv, Ukraine
 NNG:Nanning, China
 NOC:Rep Of Ireland, Connaught, Ireland
 NOU:Tontouta, Noumea, New Caledonia
@@ -1191,7 +1197,7 @@
 OAX:Xoxocotlan, Oaxaca, Oaxaca, Mexico
 OBO:Obihiro, Japan
 ODE:Odense, Denmark
-ODS:Odessa Central, Ukraine
+ODS:Odessa International, Odessa, Ukraine
 ODW:Oak Harbor, Washington, USA
 OER:Ornskoldsvik, Sweden
 OFK:Norfolk Karl Stefan Memorial, Nebraska, USA
@@ -1238,6 +1244,7 @@
 OWB:Owensboro, Kentucky, USA
 OWD:Memorial Code: Owd, Norwood, Massachusetts, USA
 OXR:Oxnard / Ventura, California, USA
+OZH:Zaporizhya International, Zaporizhya, Ukraine
 OZZ:Ouarzazate, Morocco
 PAD:Paderborn, Germany
 PAH:Paducah, Kentucky, USA
@@ -1436,6 +1443,7 @@
 RUN:Roland Garros Airport, Reunion Island, France
 RUT:Rutland, Vermont, USA
 RWI:Wilson, Rocky Mount, North Carolina, USA
+RWN:Rivne International, Rivne, Ukraine
 SAB:Saba Island, Netherlands Antilles
 SAF:Santa Fe, New Mexico, USA
 SAH:Sanaa International, Yemen
@@ -1492,7 +1500,7 @@
 SHV:Shreveport, Louisiana, USA
 SID:Amilcar Cabral International, Sal, Cape Verde
 SIN:Changi International, Singapore
-SIP:Simferopol, Ukraine
+SIP:Simferopol International, Crimea, Ukraine
 SIR:Sion, Switzerland
 SIT:Sitka, Alaska, USA
 SJC:San Jose International, California, USA
@@ -1650,6 +1658,7 @@
 TMS:Sao Tome International, Sao Tome and Principe
 TMW:Tamworth, New South Wales, Australia
 TNG:Boukhalef Souahel, Tangier, Morocco
+TNL:Ternopil International, Ternopil, Ukraine
 TNR:Ivato, Antananarivo, Madagascar
 TOL:Toledo 

Re: Binary code patching and paravirtualization

2014-12-16 Thread Alexey Suslikov
 CVSROOT: /cvs
 Module name: src
 Changes by: s...@cvs.openbsd.org 2014/12/16 14:02:58
 Modified files:
 sys/arch/amd64/amd64: identcpu.c
 sys/arch/amd64/include: specialreg.h

 Log message:
 Define and print HV cpuid flag.
 This is set by many hypervisors, including kvm, vmware, hyper-v.

do they set HV flag only for amd64 guests? how about i386 ones?



Re: Binary code patching and paravirtualization

2014-12-11 Thread Alexey Suslikov
Stefan Fritsch sf at sfritsch.de writes:

 --- a/sys/arch/amd64/include/specialreg.h
 +++ b/sys/arch/amd64/include/specialreg.h
 @@ -158,6 +158,7 @@
  #define  CPUIDECX_AVX    0x10000000  /* Advanced Vector Extensions */
  #define  CPUIDECX_F16C   0x20000000  /* 16bit fp conversion  */
  #define  CPUIDECX_RDRAND 0x40000000  /* RDRAND instruction  */
 +#define  CPUIDECX_HYPERV 0x80000000  /* Hypervisor present */

Is this flag standardized? Last time I have tried to push this, there
was an objection based on reserved for future use status of this flag.

See http://marc.info/?l=openbsd-bugs&m=136907278229145&w=2

If it is a standard nowadays, could CPUIDECX_HYPERV be committed as a
separate chunk?
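
To make clear what the flag amounts to, here is a minimal sketch of the
check itself: bit 31 of the ECX value returned by CPUID leaf 1. The
helper name cpu_has_hv_flag is made up for illustration, and reading
the register is left out on purpose.

```c
#include <stdint.h>

/* Bit 31 of ECX from CPUID leaf 1, named as in the quoted diff. */
#define CPUIDECX_HYPERV	0x80000000U

/* Return 1 if the given leaf-1 ECX value has the hypervisor bit set. */
int
cpu_has_hv_flag(uint32_t ecx)
{
	return (ecx & CPUIDECX_HYPERV) != 0;
}
```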

Cheers,
Alexey



amd64 intro(4) refs

2014-12-09 Thread Alexey Suslikov
Hello tech@.

I noticed isapnp(4) and eisa(4) refs in amd64 intro(4) while the amd64
kernel config does neither isapnp nor eisa.

Looks like a remnant of i386 intro(4).

Cheers,
Alexey



Re: amd64 intro(4) refs

2014-12-09 Thread Alexey Suslikov
Jason McIntyre jmc at kerhand.co.uk writes:

 On Tue, Dec 09, 2014 at 10:27:45PM +0200, Alexey Suslikov wrote:
  I noticed isapnp(4) and eisa(4) refs in amd64 intro(4) while amd64 kernel
  config doesn't do neither isapnp, nor eisa.
 
 those pages are not MD, so they display for all archs. that's normal for
 stuff supported by more than one arch.

From what I see, intro(4) is MD.

In LIST OF DEVICES, armish, for instance, says:

iic(4)
Inter IC (I2C) bus
onewire(4)
1-Wire bus
pci(4)
introduction to PCI bus support
usb(4)
introduction to Universal Serial Bus support

while amd64 says:

cardbus(4)
introduction to CardBus support
eisa(4)
introduction to EISA bus support
iic(4)
Inter IC (I2C) bus
isa(4)
introduction to ISA bus support
isapnp(4)
introduction to ISA Plug-and-Play support
onewire(4)
1-Wire bus
pci(4)
introduction to PCI bus support
pcmcia(4)
introduction to PCMCIA (PC Card) support
usb(4)
introduction to Universal Serial Bus support



Re: LibreSSL: GOST ciphers implementation

2014-11-06 Thread Alexey Suslikov
Chris Cappuccio chris at nmedia.net writes:

 So, you're saying, he's really dmitry at svr.gov.ru, the source of Russian
 backdoors into technology worldwide!!!
 
 I guess the open-source ecosystem has been thoroughly poisoned!
 
 Putin is going to take us over. OpenBSD and Linux are ruined! Fuck, I'm
 switching to Windows 8.

As if RSA government backdoors weren't enough, you just said you
trust GOST as well (which stands for 'GOvernment STandard').



Re: LibreSSL: GOST ciphers implementation

2014-11-06 Thread Alexey Suslikov
Bob Beck beck at openbsd.org writes:

 1) It can't mess up the code base for everyone.
 2) Everyone should not need to eat the dog food

3) I try to convince myself that our grant means
a half of a cruise missile doesn't get built (c)



Re: armv7: banana pi, Allwinner A20 board

2014-10-02 Thread Alexey Suslikov
SASANO Takayoshi uaa at mx5.nisiq.net writes:

  Try this[1] kernel and have a look if it has the same issue or not.
 
 Kernel did not started... U-Boot says checksum is ok, so maybe
 .umg file is not corrupted.

When using

OpenBSD 5.6 (RAMDISK-SUNXI) #3: Sun Aug 31 18:46:49 EDT 2014

could you drop into config (pass -c to boot) and try to disable ehci?



Re: run(4) firmware update; please test

2014-05-16 Thread Alexey Suslikov
Stefan Sperling stsp at openbsd.org writes:

 Are you able to try your run(4) device with FreeBSD-current (10 isn't
 new enough)? They claim to support your device and use the updated firmware.

Please take a look at my (unfinished) attempt to bring
MediaTek/Ralink RT5370/RT5372 support to run(4).

http://marc.info/?l=openbsd-tech&m=138903287819764&w=2






Re: vlan tagging surgery

2014-04-21 Thread Alexey Suslikov
Henning Brauer lists-openbsdtech at bsws.de writes:

 I must admit I am getting tired of all these good proposals/ideas.
 don't you think we've gone thru this before?

Look, I haven't called them good or bad.

 what you propose would require a custom vlan_output function which
 does nothing but setting the flag and then calls ether_output.
 what exactly is won with that? except making things less obvious?
 preparing for the highly likely case that something but a vlan
 interface (as in, IFT_L2VLAN) needs to add a L2VLAN ethernet header?

(I understand you want code - not theoretical speculations).

I assume there is an input and an output of the stack.

And a lot of (possible) encapsulation subsystems in the middle: vlan,
vlan-in-vlan, ipsec, you name it.

And if I understood your cksum plan right, a given packet, being in the
middle, doesn't know its destiny, but different subsystems may assign
tags so that on output the packet can assemble itself correctly (by
calling the necessary methods).

Given the number of subsystems, delayed processing (a variation of the
promise pattern, actually) is the way to go, imo, because the stack will
have a homogeneous approach to the entire packet assembly logic.

In terms of above pattern, right: vlan_output will only set a flag
and call ether_output - this is what you already did with cksums.
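
A rough sketch of the pattern I mean, with made-up names. The struct
pkt, its flags and both *_sketch functions are hypothetical; this is
not the mbuf header and not how the stack is written today.

```c
#include <stdint.h>

/* Hypothetical per-packet flags; not the real mbuf header flags. */
#define PKT_NEED_VLAN	0x01	/* a VLAN tag must be inserted on output */
#define PKT_NEED_CSUM	0x02	/* a checksum must be filled in on output */

struct pkt {
	uint32_t	flags;		/* work deferred until output */
	uint16_t	vlan_id;	/* valid when PKT_NEED_VLAN is set */
	int		tagged;		/* set once the tag has been applied */
	int		summed;		/* set once the checksum has been applied */
};

/* Final output stage: the only place that acts on the deferred work. */
void
ether_output_sketch(struct pkt *p)
{
	if (p->flags & PKT_NEED_CSUM)
		p->summed = 1;
	if (p->flags & PKT_NEED_VLAN)
		p->tagged = 1;
	p->flags = 0;
}

/* Middle layer: records what is needed, then hands the packet on. */
void
vlan_output_sketch(struct pkt *p, uint16_t vid)
{
	p->vlan_id = vid;
	p->flags |= PKT_NEED_VLAN;
	ether_output_sketch(p);
}
```

The point is only that the middle layers record intent and a single
output routine does the assembly.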



Re: vlan tagging surgery

2014-04-21 Thread Alexey Suslikov
Henning Brauer lists-openbsdtech at bsws.de writes:

  And lot of (possible) encapsulation subsystems in the middle: vlan,
  vlan-in-vlan, ipsec, you name it.
 
 VLAN IS NOT AN ENCAPSULATION.

Well, vlan(4) says:

vlan, svlan - IEEE 802.1Q/1AD encapsulation/decapsulation pseudo-device

  Given a number of subsystems, delayed processing (promise pattern
  variation, actually) is way to go, imo, because stack will have
  homogeneous approach for entire packet assembly logic.
 
 you cannot delay this reasonably, it IS far down the road, basically
 right before sending the frame out.
 
  In terms of above pattern, right: vlan_output will only set a flag
  and call ether_output - this is what you already did with cksums.
 
 no, not even remotely. sigh.

Functionally, no, - I understand your point.

But I'm talking about *pattern* you used.

Looking at what Martin is doing, imo, you guys are trying to achieve:

a) concentrate all packet (re)assembly in one place to minimize
memory operations (so you need to delay some things);

b) put one lock in and one lock out (you also need to delay to be
able to put one single block of code somewhere in the output).

What I see is the old (spaghetti) approach and the new (delayed)
approach trying to coexist.



WIP: MediaTek/Ralink RT5370/RT5372

2014-01-06 Thread Alexey Suslikov
This is based on

http://svnweb.freebsd.org/base?view=revision&revision=257955

For now, my DWA-140 rev B3 is able to
* attach to run(4) and correctly identify MAC address;
* load firmware on ifconfig up;
* blink LED (so I think something is going thru a radio);
* ifconfig down (LED stops blinking).

run0 at uhub3 port 1 D-Link 11n Adapter rev 2.00/1.01 addr 6
run0: MAC/BBP RT5392 (rev 0x0222), RF RT5372 (MIMO 2T2R), address fc:75:16:85:ae:80

$ ifconfig run0 
run0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
lladdr fc:75:16:85:ae:80
priority: 4
groups: wlan
media: IEEE802.11 autoselect (DS1)
status: no network
ieee80211: nwid 

But ifconfig run0 scan gives nothing and adapter is unable to
associate with AP if directly specified by nwid/wpakey.

Any clue is welcome.

(The following diff requires usbdevs diff I have sent previously).


Index: sys/dev/ic/rt2860reg.h
===
RCS file: /cvs/src/sys/dev/ic/rt2860reg.h,v
retrieving revision 1.31
diff -u -p -r1.31 rt2860reg.h
--- sys/dev/ic/rt2860reg.h  26 Nov 2013 20:33:16 -0000  1.31
+++ sys/dev/ic/rt2860reg.h  6 Jan 2014 13:45:14 -0000
@@ -696,6 +696,7 @@
 
 /* possible flags for RT3020 RF register 1 */
 #define RT3070_RF_BLOCK    (1 << 0)
+#define RT3070_PLL_PD  (1 << 1)
 #define RT3070_RX0_PD  (1 << 2)
 #define RT3070_TX0_PD  (1 << 3)
 #define RT3070_RX1_PD  (1 << 4)
@@ -747,6 +748,15 @@
 
 #define RT3090_DEF_LNA 10
 
+/* Possible flags for RT5390 RF register 3. */
+#define RT5390_VCOCAL  (1 << 7)
+
+/* Possible flags for RT5390 RF register 38. */
+#define RT5390_RX_LO1  (1 << 5)
+
+/* Possible flags for RT5390 RF register 39. */
+#define RT5390_RX_LO2  (1 << 7)
+
 /* RT2860 TX descriptor */
 struct rt2860_txd {
uint32_tsdp0;   /* Segment Data Pointer 0 */
@@ -880,17 +890,19 @@ struct rt2860_rxwi {
 #define RT2860_RF3 1
 #define RT2860_RF4 3
 
-#define RT2860_RF_2820 1   /* 2T3R */
-#define RT2860_RF_2850 2   /* dual-band 2T3R */
-#define RT2860_RF_2720 3   /* 1T2R */
-#define RT2860_RF_2750 4   /* dual-band 1T2R */
-#define RT3070_RF_3020 5   /* 1T1R */
-#define RT3070_RF_2020 6   /* b/g */
-#define RT3070_RF_3021 7   /* 1T2R */
-#define RT3070_RF_3022 8   /* 2T2R */
-#define RT3070_RF_3052 9   /* dual-band 2T2R */
-#define RT3070_RF_3320 11  /* 1T1R */
-#define RT3070_RF_3053 13  /* dual-band 3T3R */
+#define RT2860_RF_2820 0x0001  /* 2T3R */
+#define RT2860_RF_2850 0x0002  /* dual-band 2T3R */
+#define RT2860_RF_2720 0x0003  /* 1T2R */
+#define RT2860_RF_2750 0x0004  /* dual-band 1T2R */
+#define RT3070_RF_3020 0x0005  /* 1T1R */
+#define RT3070_RF_2020 0x0006  /* b/g */
+#define RT3070_RF_3021 0x0007  /* 1T2R */
+#define RT3070_RF_3022 0x0008  /* 2T2R */
+#define RT3070_RF_3052 0x0009  /* dual-band 2T2R */
+#define RT3070_RF_3320 0x000b  /* 1T1R */
+#define RT3070_RF_3053 0x000d  /* dual-band 3T3R */
+#define RT5390_RF_5370 0x5370  /* 1T1R */
+#define RT5390_RF_5372 0x5372  /* 2T2R */
 
 /* USB commands for RT2870 only */
 #define RT2870_RESET   1
@@ -1084,63 +1096,94 @@ static const struct rt2860_rate {
{ 105, 0x05 },  \
{ 106, 0x35 }
 
+#define RT5390_DEF_BBP \
+   {  31, 0x08 },  \
+   {  65, 0x2c },  \
+   {  66, 0x38 },  \
+   {  68, 0x0b },  \
+   {  69, 0x0d },  \
+   {  70, 0x06 },  \
+   {  73, 0x13 },  \
+   {  75, 0x46 },  \
+   {  76, 0x28 },  \
+   {  77, 0x59 },  \
+   {  81, 0x37 },  \
+   {  82, 0x62 },  \
+   {  83, 0x7a },  \
+   {  84, 0x9a },  \
+   {  86, 0x38 },  \
+   {  91, 0x04 },  \
+   {  92, 0x02 },  \
+   { 103, 0xc0 },  \
+   { 104, 0x92 },  \
+   { 105, 0x3c },  \
+   { 106, 0x03 },  \
+   { 128, 0x12 }
+
 /*
  * Default settings for RF registers; values derived from the reference driver.
  */
-#define RT2860_RF2850  \
-   {   1, 0x100bb3, 0x1301e1, 0x05a014, 0x001402 },\
-   {   2, 0x100bb3, 0x1301e1, 0x05a014, 0x001407 },\
-   {   3, 0x100bb3, 0x1301e2, 0x05a014, 0x001402 },\
-   {   4, 0x100bb3, 0x1301e2, 0x05a014, 0x001407 },\
-   {   5, 0x100bb3, 0x1301e3, 0x05a014, 0x001402 },\
-   {   6, 0x100bb3, 0x1301e3, 0x05a014, 0x001407 },\
-   {   7, 0x100bb3, 0x1301e4, 0x05a014, 0x001402 },\
-   {   8, 0x100bb3, 0x1301e4, 0x05a014, 0x001407 },\
-   {   9, 0x100bb3, 0x1301e5, 0x05a014, 0x001402 },\
-   {  10, 0x100bb3, 0x1301e5, 0x05a014, 0x001407 },\
-   {  11, 0x100bb3, 0x1301e6, 0x05a014, 0x001402 },\
-   {  12, 0x100bb3, 0x1301e6, 0x05a014, 0x001407 },\
-   {  13, 0x100bb3, 0x1301e7, 0x05a014, 0x001402 },\
-   {  14, 0x100bb3, 0x1301e8, 0x05a014, 0x001404 },\
-   {  36, 0x100bb3, 0x130266, 0x056014, 0x001408 

Randomization from the bootblocks

2014-01-02 Thread Alexey Suslikov
Theo de Raadt deraadt at cvs.openbsd.org writes:

 This requires an upgrade of the bootblocks and at least
 /etc/rc (which saves an entropy file for future use).  Some
 bootblocks will be able to use machine-dependent features
 to improve the entropy even further (for instance using
 random instructions or fast-running counters or such).

 As a result, the kernel can start using arc4random()
 exceedingly early on, even before interrupt entropy is
 collected.  The randomization subsystem can hopefully
 become simpler due to this early entropy.. there is more
 work do here.

I have a question.

Having no interrupt (and similar) entropy means less entropy.

On the other hand, there is a lot of speculation that some
hardware entropy sources are suspected (proven?) to be bad (or
intentionally hijacked?).

So the question here is: does moving random generation closer
to hardware pave the way to more predictable numbers?

Cheers,
Alexey



pfsync(4) mangles prio in master/slave setup

2013-11-20 Thread Alexey Suslikov
Hi.

This is on 5.4-stable. Trivial master/slave carp(4) setup. vlan(4) is used
to make the picture clear wrt prio.

Test 1 (without using match).

pf.conf (BOX1 and BOX2).

ext_if=vlan101
dmz_if=vlan10
pf_sync=vlan50
block log all
pass quick on $pf_sync proto pfsync keep state (no-sync) set prio 7
pass quick on { $ext_if, $dmz_if } proto carp keep state (no-sync)
pass quick on $dmz_if inet proto icmp all icmp-type echoreq set prio 5
pass quick on $dmz_if
pass out quick on $ext_if inet proto icmp all icmp-type echoreq set prio 5
pass out quick on $ext_if

BOX1 is Master, BOX2 is Slave.

BOX1:
00:07:36.108948 802.1Q vid 10 pri 3 X.X.185.145 > X.X.36.14: icmp: echo request
00:07:36.109281 802.1Q vid 101 pri 5 X.X.185.145 > X.X.36.14: icmp: echo request
00:07:36.110013 802.1Q vid 101 pri 0 X.X.36.14 > X.X.185.145: icmp: echo reply
00:07:36.110030 802.1Q vid 10 pri 5 X.X.36.14 > X.X.185.145: icmp: echo reply

BOX1 is Slave, BOX2 is Master.

BOX2:
00:12:43.981979 802.1Q vid 10 pri 3 X.X.185.145 > X.X.36.14: icmp: echo request
00:12:43.982013 802.1Q vid 101 pri 0 X.X.185.145 > X.X.36.14: icmp: echo request
00:12:43.982693 802.1Q vid 101 pri 0 X.X.36.14 > X.X.185.145: icmp: echo reply
00:12:43.982713 802.1Q vid 10 pri 0 X.X.36.14 > X.X.185.145: icmp: echo reply

Test 2 (using match).

pf.conf (BOX1 and BOX2).

ext_if=vlan101
dmz_if=vlan10
pf_sync=vlan50
block log all
match quick on { $ext_if, $dmz_if } inet proto icmp all icmp-type echoreq set prio 5
pass quick on $pf_sync proto pfsync keep state (no-sync) set prio 7
pass quick on { $ext_if, $dmz_if } proto carp keep state (no-sync)
pass quick on $dmz_if
pass out quick on $ext_if

BOX1 is Master, BOX2 is Slave.

BOX1:
00:27:47.442820 802.1Q vid 10 pri 3 X.X.185.145 > X.X.36.14: icmp: echo request
00:27:47.442839 802.1Q vid 101 pri 5 X.X.185.145 > X.X.36.14: icmp: echo request
00:27:48.468709 802.1Q vid 101 pri 0 X.X.36.14 > X.X.185.145: icmp: echo reply
00:27:47.443523 802.1Q vid 10 pri 5 X.X.36.14 > X.X.185.145: icmp: echo reply

BOX1 is Slave, BOX2 is Master.

BOX2:
00:30:35.317329 802.1Q vid 10 pri 3 X.X.185.145 > X.X.36.14: icmp: echo request
00:30:35.317354 802.1Q vid 101 pri 0 X.X.185.145 > X.X.36.14: icmp: echo request
00:30:35.318065 802.1Q vid 101 pri 0 X.X.36.14 > X.X.185.145: icmp: echo reply
00:30:35.318084 802.1Q vid 10 pri 0 X.X.36.14 > X.X.185.145: icmp: echo reply

Maybe ICMP is not the sort of traffic where this makes a difference, but
think about prioritized TCP ACKs. Switching to Slave in a production setup
makes things *REALLY* bad.

Should I configure something, or is this an issue?

(Speaking of pfsync code, I'm unable to find where prio is set inside
pfsync_state_import).

Thanks,
Alexey



Re: pfsync(4) mangles prio in master/slave setup

2013-11-20 Thread Alexey Suslikov
On Wed, Nov 20, 2013 at 1:32 PM, Mike Belopuhov m...@belopuhov.com wrote:
 could you please add more description to this report since
 it's very hard to follow and interpret your mail.

Basically, when the setup switches to the slave, packets matching a
given state get the wrong prio (they had the right one when the
state was created on the master).

I will be glad to provide more information/tests/etc - just say
what is needed.


 On 20 November 2013 12:11, Alexey Suslikov alexey.susli...@gmail.com wrote:
 Hi.

 This is on 5.4-stable. Trivial master/slave carp(4) setup. vlan(4) is to
 make picture clear wrt prio.

 Test 1 (without using match).

 pf.conf (BOX1 and BOX2).

 ext_if=vlan101
 dmz_if=vlan10
 pf_sync=vlan50
 block log all
 pass quick on $pf_sync proto pfsync keep state (no-sync) set prio 7
 pass quick on { $ext_if, $dmz_if } proto carp keep state (no-sync)
 pass quick on $dmz_if inet proto icmp all icmp-type echoreq set prio 5
 pass quick on $dmz_if
 pass out quick on $ext_if inet proto icmp all icmp-type echoreq set prio 5
 pass out quick on $ext_if

 BOX1 is Master, BOX2 is Slave.

 BOX1:
 00:07:36.108948 802.1Q vid 10 pri 3 X.X.185.145 > X.X.36.14: icmp: echo request
 00:07:36.109281 802.1Q vid 101 pri 5 X.X.185.145 > X.X.36.14: icmp: echo request
 00:07:36.110013 802.1Q vid 101 pri 0 X.X.36.14 > X.X.185.145: icmp: echo reply
 00:07:36.110030 802.1Q vid 10 pri 5 X.X.36.14 > X.X.185.145: icmp: echo reply

 BOX1 is Slave, BOX2 is Master.

 BOX2:
 00:12:43.981979 802.1Q vid 10 pri 3 X.X.185.145 > X.X.36.14: icmp: echo request
 00:12:43.982013 802.1Q vid 101 pri 0 X.X.185.145 > X.X.36.14: icmp: echo request
 00:12:43.982693 802.1Q vid 101 pri 0 X.X.36.14 > X.X.185.145: icmp: echo reply
 00:12:43.982713 802.1Q vid 10 pri 0 X.X.36.14 > X.X.185.145: icmp: echo reply

 Test 2 (using match).

 pf.conf (BOX1 and BOX2).

 ext_if=vlan101
 dmz_if=vlan10
 pf_sync=vlan50
 block log all
 match quick on { $ext_if, $dmz_if } inet proto icmp all icmp-type echoreq set prio 5
 pass quick on $pf_sync proto pfsync keep state (no-sync) set prio 7
 pass quick on { $ext_if, $dmz_if } proto carp keep state (no-sync)
 pass quick on $dmz_if
 pass out quick on $ext_if

 BOX1 is Master, BOX2 is Slave.

 BOX1:
 00:27:47.442820 802.1Q vid 10 pri 3 X.X.185.145 > X.X.36.14: icmp: echo request
 00:27:47.442839 802.1Q vid 101 pri 5 X.X.185.145 > X.X.36.14: icmp: echo request
 00:27:48.468709 802.1Q vid 101 pri 0 X.X.36.14 > X.X.185.145: icmp: echo reply
 00:27:47.443523 802.1Q vid 10 pri 5 X.X.36.14 > X.X.185.145: icmp: echo reply

 BOX1 is Slave, BOX2 is Master.

 BOX2:
 00:30:35.317329 802.1Q vid 10 pri 3 X.X.185.145 > X.X.36.14: icmp: echo request
 00:30:35.317354 802.1Q vid 101 pri 0 X.X.185.145 > X.X.36.14: icmp: echo request
 00:30:35.318065 802.1Q vid 101 pri 0 X.X.36.14 > X.X.185.145: icmp: echo reply
 00:30:35.318084 802.1Q vid 10 pri 0 X.X.36.14 > X.X.185.145: icmp: echo reply

 Maybe ICMP is not a sort of traffic which makes difference, but think
 about TCP ACKs are prioritized. Switching to Slave in production setup
 makes things *REALLY* bad.

 Should I configure something, or this is an issue?

 (Speaking of pfsync code, I'm unable to find where prio is set inside
 pfsync_state_import).

 Thanks,
 Alexey




Re: pfsync(4) mangles prio in master/slave setup

2013-11-20 Thread Alexey Suslikov
On Wed, Nov 20, 2013 at 1:38 PM, Alexey Suslikov
alexey.susli...@gmail.com wrote:
 On Wed, Nov 20, 2013 at 1:32 PM, Mike Belopuhov m...@belopuhov.com wrote:
 could you please add more description to this report since
 it's very hard to follow and interpret your mail.

 basically, when setup switches to slave, packets (matching
 given state) have wrong prio set (wrong means they were
 right when state was created on master).

 I will be glad to provide more information/tests/etc - just say
 what is needed.


 On 20 November 2013 12:11, Alexey Suslikov alexey.susli...@gmail.com wrote:
 Hi.

 This is on 5.4-stable. Trivial master/slave carp(4) setup. vlan(4) is to
 make picture clear wrt prio.

 Test 1 (without using match).

 pf.conf (BOX1 and BOX2).

 ext_if=vlan101
 dmz_if=vlan10
 pf_sync=vlan50
 block log all
 pass quick on $pf_sync proto pfsync keep state (no-sync) set prio 7
 pass quick on { $ext_if, $dmz_if } proto carp keep state (no-sync)
 pass quick on $dmz_if inet proto icmp all icmp-type echoreq set prio 5
 pass quick on $dmz_if
 pass out quick on $ext_if inet proto icmp all icmp-type echoreq set prio 5
 pass out quick on $ext_if

 BOX1 is Master, BOX2 is Slave.

 BOX1:
 00:07:36.108948 802.1Q vid 10 pri 3 X.X.185.145 > X.X.36.14: icmp: echo request
 00:07:36.109281 802.1Q vid 101 pri 5 X.X.185.145 > X.X.36.14: icmp: echo request
 00:07:36.110013 802.1Q vid 101 pri 0 X.X.36.14 > X.X.185.145: icmp: echo reply
 00:07:36.110030 802.1Q vid 10 pri 5 X.X.36.14 > X.X.185.145: icmp: echo reply

 BOX1 is Slave, BOX2 is Master.

 BOX2:
 00:12:43.981979 802.1Q vid 10 pri 3 X.X.185.145 > X.X.36.14: icmp: echo request
 00:12:43.982013 802.1Q vid 101 pri 0 X.X.185.145 > X.X.36.14: icmp: echo request
 00:12:43.982693 802.1Q vid 101 pri 0 X.X.36.14 > X.X.185.145: icmp: echo reply
 00:12:43.982713 802.1Q vid 10 pri 0 X.X.36.14 > X.X.185.145: icmp: echo reply

On the Slave, the all-zero prio in the output path (echo request in
vlan101 and reply in vlan10) indicates, imo, that the state is created by
pfsync_state_import without taking prio into account.

Unlike set_tos, min_ttl and such, prio isn't handled by pf_state_export
either, so I think prio is neither exported to the pfsync packet nor
imported from it.
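
To make the suspicion above concrete, here is a minimal sketch of how a field can get lost on an export/import round trip. All type and function names (demo_state, demo_wire_state, demo_state_export, demo_state_import) are hypothetical stand-ins, not the real pf or pfsync structures; the only assumption carried over from the report is that the wire format has room for tos and ttl but not for prio.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not the real pf/pfsync structures. */
struct demo_state {
	uint8_t set_tos;
	uint8_t min_ttl;
	uint8_t set_prio[2];
};

/* Wire format that carries tos and ttl but has no prio field. */
struct demo_wire_state {
	uint8_t set_tos;
	uint8_t min_ttl;
};

/* Master side: copy only the fields the wire format knows about. */
void
demo_state_export(const struct demo_state *st, struct demo_wire_state *ws)
{
	ws->set_tos = st->set_tos;
	ws->min_ttl = st->min_ttl;
}

/* Slave side: build a fresh state; fields absent from the wire stay zero. */
void
demo_state_import(const struct demo_wire_state *ws, struct demo_state *st)
{
	memset(st, 0, sizeof(*st));
	st->set_tos = ws->set_tos;
	st->min_ttl = ws->min_ttl;
}
```

If the real code behaves like this model, a state created on the master with prio 5 would arrive on the slave with prio 0 in both slots, which matches the captures on BOX2.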


 Test 2 (using match).

 pf.conf (BOX1 and BOX2).

 ext_if=vlan101
 dmz_if=vlan10
 pf_sync=vlan50
 block log all
 match quick on { $ext_if, $dmz_if } inet proto icmp all icmp-type echoreq set prio 5
 pass quick on $pf_sync proto pfsync keep state (no-sync) set prio 7
 pass quick on { $ext_if, $dmz_if } proto carp keep state (no-sync)
 pass quick on $dmz_if
 pass out quick on $ext_if

 BOX1 is Master, BOX2 is Slave.

 BOX1:
 00:27:47.442820 802.1Q vid 10 pri 3 X.X.185.145 > X.X.36.14: icmp: echo request
 00:27:47.442839 802.1Q vid 101 pri 5 X.X.185.145 > X.X.36.14: icmp: echo request
 00:27:48.468709 802.1Q vid 101 pri 0 X.X.36.14 > X.X.185.145: icmp: echo reply
 00:27:47.443523 802.1Q vid 10 pri 5 X.X.36.14 > X.X.185.145: icmp: echo reply

 BOX1 is Slave, BOX2 is Master.

 BOX2:
 00:30:35.317329 802.1Q vid 10 pri 3 X.X.185.145 > X.X.36.14: icmp: echo request
 00:30:35.317354 802.1Q vid 101 pri 0 X.X.185.145 > X.X.36.14: icmp: echo request
 00:30:35.318065 802.1Q vid 101 pri 0 X.X.36.14 > X.X.185.145: icmp: echo reply
 00:30:35.318084 802.1Q vid 10 pri 0 X.X.36.14 > X.X.185.145: icmp: echo reply

 Maybe ICMP is not a sort of traffic which makes difference, but think
 about TCP ACKs are prioritized. Switching to Slave in production setup
 makes things *REALLY* bad.

 Should I configure something, or this is an issue?

 (Speaking of pfsync code, I'm unable to find where prio is set inside
 pfsync_state_import).

 Thanks,
 Alexey




Re: pfsync(4) mangles prio in master/slave setup

2013-11-20 Thread Alexey Suslikov
On Wed, Nov 20, 2013 at 2:15 PM, Florian Obser flor...@openbsd.org wrote:
 On Wed, Nov 20, 2013 at 01:38:11PM +0200, Alexey Suslikov wrote:
 On Wed, Nov 20, 2013 at 1:32 PM, Mike Belopuhov m...@belopuhov.com wrote:
  could you please add more description to this report since
  it's very hard to follow and interpret your mail.

 basically, when setup switches to slave, packets (matching
 given state) have wrong prio set (wrong means they were
 right when state was created on master).

 I will be glad to provide more information/tests/etc - just say
 what is needed.

 Do you have the same ruleset checksum on both machines? check with
 pfctl -vs info | fgrep Checksum

yes. checksums are same.



 
 


 --
 I'm not entirely sure you are real.



Re: Unexpected match set prio behaviour

2013-11-18 Thread Alexey Suslikov
On Mon, Nov 18, 2013 at 3:03 AM, Alexander Bluhm
alexander.bl...@gmx.net wrote:
 On Thu, Nov 14, 2013 at 12:03:21AM +0200, Alexey Suslikov wrote:
 This is on 5.4-stable. vlan is only used to see what resulting prio is.

 #match on { $int_if } inet proto icmp all icmp-type echoreq set prio 5
 pass quick on { $ext_if, $int_if }

 Can you test whether this diff matches your expected behaviour?
 Please try various combinations of pass and match rules.

 bluhm

 Index: net/pf.c
 ===
 RCS file: /data/mirror/openbsd/cvs/src/sys/net/pf.c,v
 retrieving revision 1.861
 diff -u -p -r1.861 pf.c
 --- net/pf.c16 Nov 2013 00:36:01 -  1.861
 +++ net/pf.c18 Nov 2013 00:56:55 -
 @@ -3110,8 +3110,10 @@ pf_rule_to_actions(struct pf_rule *r, st
         a->max_mss = r->max_mss;
         a->flags |= (r->scrub_flags & (PFSTATE_NODF|PFSTATE_RANDOMID|
             PFSTATE_SETTOS|PFSTATE_SCRUB_TCP|PFSTATE_SETPRIO));
 -       a->set_prio[0] = r->set_prio[0];
 -       a->set_prio[1] = r->set_prio[1];
 +       if (r->scrub_flags & PFSTATE_SETPRIO) {
 +               a->set_prio[0] = r->set_prio[0];
 +               a->set_prio[1] = r->set_prio[1];
 +       }
  }

  #define PF_TEST_ATTRIB(t, a)   \

well, it seems like I have the expected results now, at least for the
following test cases. please tell me if you need more.

for the record, the issue in question was discovered by Roman Kravchuk,
I just assisted with analysis and reporting.
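
For readers following the diff quoted above, here is a stripped-down model of why the unconditional copy produced prio 0. The structures and the DEMO_SETPRIO value are assumptions made for this sketch (the real struct pf_rule and PFSTATE_SETPRIO differ); only the shape of the before/after logic is taken from the diff.

```c
#include <stdint.h>

#define DEMO_SETPRIO 0x01	/* stand-in for PFSTATE_SETPRIO */

struct demo_rule {
	uint16_t scrub_flags;
	uint8_t  set_prio[2];
};

struct demo_actions {
	uint16_t flags;
	uint8_t  set_prio[2];
};

/* Before the diff: priorities are copied from every matching rule. */
void
demo_rule_to_actions_old(const struct demo_rule *r, struct demo_actions *a)
{
	a->flags |= (r->scrub_flags & DEMO_SETPRIO);
	a->set_prio[0] = r->set_prio[0];
	a->set_prio[1] = r->set_prio[1];
}

/* After the diff: only rules that carry "set prio" touch the priorities. */
void
demo_rule_to_actions_new(const struct demo_rule *r, struct demo_actions *a)
{
	a->flags |= (r->scrub_flags & DEMO_SETPRIO);
	if (r->scrub_flags & DEMO_SETPRIO) {
		a->set_prio[0] = r->set_prio[0];
		a->set_prio[1] = r->set_prio[1];
	}
}
```

Applying a "match ... set prio 5" rule followed by a plain "pass" rule leaves the flag set in both versions, but the old version overwrites the priorities with the pass rule's zeroes while the new one keeps 5.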

Test 1 (default prio):

# cat /etc/pf.conf
ext_if=em0
int_if=vlan2525
set skip on { lo enc0 em1 }
block log all
#match on { $int_if } inet proto icmp all icmp-type echoreq set prio 6
#match on { $int_if } inet proto udp to port domain set prio 5
#match on { $int_if } inet proto tcp set prio (2, 4)
pass quick on { $ext_if, $int_if }

ICMP
12:45:57.293179 802.1Q vid 2525 pri 3 192.168.100.1 > 192.168.100.2:
icmp: echo request
12:45:57.293491 802.1Q vid 2525 pri 3 192.168.100.2 > 192.168.100.1:
icmp: echo reply

TCP
12:46:39.953468 802.1Q vid 2525 pri 3 192.168.100.1.17637 >
192.168.100.2.80: S 370622106:370622106(0) win 16384 <mss
1460,nop,nop,sackOK,nop,wscale 3,nop,nop,timestamp 1183962946 0> (DF)
12:46:39.953944 802.1Q vid 2525 pri 3 192.168.100.2.80 >
192.168.100.1.17637: S 3464733189:3464733189(0) ack 370622107 win
16384 <mss 1460,nop,nop,sackOK,nop,wscale 3,nop,nop,timestamp
448817884 1183962946> (DF)
12:46:39.954024 802.1Q vid 2525 pri 3 192.168.100.1.17637 >
192.168.100.2.80: . ack 1 win 2048 <nop,nop,timestamp 1183962946
448817884> (DF)
12:46:39.963421 802.1Q vid 2525 pri 3 192.168.100.1.17637 >
192.168.100.2.80: P 1:230(229) ack 1 win 2048 <nop,nop,timestamp
1183962946 448817884> (DF)
12:46:39.970068 802.1Q vid 2525 pri 3 192.168.100.2.80 >
192.168.100.1.17637: . 1:1449(1448) ack 230 win 2172
<nop,nop,timestamp 448817884 1183962946> (DF)
12:46:39.970095 802.1Q vid 2525 pri 3 192.168.100.2.80 >
192.168.100.1.17637: P 1449:2516(1067) ack 230 win 2172
<nop,nop,timestamp 448817884 1183962946> (DF)
12:46:39.970172 802.1Q vid 2525 pri 3 192.168.100.1.17637 >
192.168.100.2.80: . ack 2516 win 1733 <nop,nop,timestamp 1183962946
448817884> (DF)
12:46:39.970214 802.1Q vid 2525 pri 3 192.168.100.2.80 >
192.168.100.1.17637: F 2516:2516(0) ack 230 win 2172
<nop,nop,timestamp 448817884 1183962946> (DF)
12:46:39.970280 802.1Q vid 2525 pri 3 192.168.100.1.17637 >
192.168.100.2.80: . ack 2517 win 1733 <nop,nop,timestamp 1183962946
448817884> (DF)
12:46:39.993600 802.1Q vid 2525 pri 3 192.168.100.1.17637 >
192.168.100.2.80: F 230:230(0) ack 2517 win 2048 <nop,nop,timestamp
1183962946 448817884> (DF)
12:46:39.993927 802.1Q vid 2525 pri 3 192.168.100.2.80 >
192.168.100.1.17637: . ack 231 win 2172 <nop,nop,timestamp 448817884
1183962946> (DF)

UDP
12:47:58.298665 802.1Q vid 2525 pri 3 192.168.100.1.39295 >
192.168.100.2.53: 36561+ A? i.ua. (22)
12:47:58.552804 802.1Q vid 2525 pri 3 192.168.100.2.53 >
192.168.100.1.39295: 36561 1/2/0 A 91.198.36.14 (74)

Test 2 (match takes care of prio):

# cat /etc/pf.conf
ext_if=em0
int_if=vlan2525
set skip on { lo enc0 em1 }
block log all
match on { $int_if } inet proto icmp all icmp-type echoreq set prio 6
match on { $int_if } inet proto udp to port domain set prio 5
match on { $int_if } inet proto tcp set prio (2, 4)
pass quick on { $ext_if, $int_if }
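
A note on the two-value form "set prio (2, 4)" used in this ruleset: as I read pf.conf(5), the second priority is used for packets with a TOS of lowdelay and for TCP ACKs with no data payload, and the first priority for everything else. The sketch below models only that selection; demo_pick_prio and the DEMO_* constants are hypothetical names for illustration, and the in-kernel logic may differ in detail.

```c
#include <stdint.h>

#define DEMO_TH_ACK        0x10	/* TCP ACK flag bit */
#define DEMO_TOS_LOWDELAY  0x10	/* IPTOS_LOWDELAY */

/*
 * Pick one of the two configured priorities for a packet.
 * For non-TCP packets pass tcp_flags = 0.
 */
uint8_t
demo_pick_prio(const uint8_t set_prio[2], uint8_t tos, uint8_t tcp_flags,
    unsigned int payload_len)
{
	if (tos & DEMO_TOS_LOWDELAY)
		return set_prio[1];
	if ((tcp_flags & DEMO_TH_ACK) && payload_len == 0)
		return set_prio[1];
	return set_prio[0];
}
```

With (2, 4) this gives prio 2 for a SYN and prio 4 for a bare ACK, which is what the outgoing packets in the TCP capture of this test show.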

ICMP
12:52:44.783107 802.1Q vid 2525 pri 6 192.168.100.1 > 192.168.100.2:
icmp: echo request
12:52:44.783516 802.1Q vid 2525 pri 6 192.168.100.2 > 192.168.100.1:
icmp: echo reply

TCP
12:53:28.007629 802.1Q vid 2525 pri 2 192.168.100.1.49012 >
192.168.100.2.80: S 2694025614:2694025614(0) win 16384 <mss
1460,nop,nop,sackOK,nop,wscale 3,nop,nop,timestamp 80976101 0> (DF)
12:53:28.007915 802.1Q vid 2525 pri 3 192.168.100.2.80 >
192.168.100.1.49012: S 704605823:704605823(0) ack 2694025615 win 16384
<mss 1460,nop,nop,sackOK,nop,wscale 3,nop,nop,timestamp 281624921
80976101> (DF)
12:53:28.007990 802.1Q vid 2525 pri 4 192.168.100.1.49012 >
192.168.100.2.80: . ack 1 win 2048 <nop,nop,timestamp 80976101

Unexpected match set prio behaviour

2013-11-13 Thread Alexey Suslikov
Hi tech@.

This is on 5.4-stable. vlan is only used to see what resulting prio is.

The ruleset:
---
ext_if=em0
int_if=vlan2525
set skip on { lo enc0 em1 }
block log all
#match on { $int_if } inet proto icmp all icmp-type echoreq set prio 5
pass quick on { $ext_if, $int_if }
---

The vlan:
---
vlan2525: flags=28843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,NOINET6> mtu 1500
        lladdr 00:1a:4a:a8:0a:8c
        description: LAN
        priority: 0
        vlan: 2525 parent interface: em1
        groups: vlan
        status: active
        inet 192.168.100.1 netmask 0xff00 broadcast 192.168.100.255
---

Pinging 192.168.100.2 (which is behind vlan2525) gives expected result:

23:51:02.154928 802.1Q vid 2525 pri 3 192.168.100.1 > 192.168.100.2:
icmp: echo request
23:51:02.155313 802.1Q vid 2525 pri 3 192.168.100.2 > 192.168.100.1:
icmp: echo reply

prio is set to 3 according to documentation.

Now, after I uncomment match rule and ping 192.168.100.2, the result is:

23:54:02.865267 802.1Q vid 2525 pri 0 192.168.100.1 > 192.168.100.2:
icmp: echo request
23:54:02.865485 802.1Q vid 2525 pri 0 192.168.100.2 > 192.168.100.1:
icmp: echo reply

prio 0 is somewhat unexpected.

Am I doing something wrong?

Cheers,
Alexey



Re: 2 x em(4) and 2 x bnx(4) trunk(4)s behave differently

2013-11-11 Thread Alexey Suslikov
On Mon, Nov 11, 2013 at 11:42 AM, Stuart Henderson s...@spacehopper.org wrote:
 master on em0/em1/bnx0 is nothing to do with trunk, it is about the gigabit 
 ethernet clocking source.

ok, but it is obvious: documentation is unclear (silent) about that.


 lacp hashing policy is the same as for loadbalance, see the manpage and 
 confirm in trunk_hashmbuf().

I see different inbound packet distribution on trunk on top of em(4)s
and on trunk on top of bnx(4)s - that's the real problem.



Re: 2 x em(4) and 2 x bnx(4) trunk(4)s behave differently

2013-11-11 Thread Alexey Suslikov
On Mon, Nov 11, 2013 at 12:19 PM, Janne Johansson icepic...@gmail.com wrote:
 I'm not sure if I am misunderstanding your direction of inbound, but that
 would be an effect of what the switch does, would it not?
 If the switch isn't configured for LACP correctly, then it would send the
 traffic to one of them, only.

again, consider the following output

IFACE   STATE  DESC     IPKTS  IBYTES   IERRS   OPKTS  OBYTES   OERRS   COLLS
bnx0    up:U             2873   2956K       0       2     977       0       0
bnx1    up:U                5     360       0    3119   2604K       0       0
trunk0  up:U             2878   2956K       0    3121   2605K       0       0

(inbound is distributed via single interface, outbound - via 2nd
interface in trunk)

IFACE   STATE  DESC     IPKTS  IBYTES   IERRS   OPKTS  OBYTES   OERRS   COLLS
em0     up:U             2711   2859K       0    5593   5222K       0       0
em1     up:U             2867   2343K       0      10    3226       0       0
trunk0  up:U             5578   5202K       0    5603   5225K       0       0

(inbound is distributed via both interfaces, outbound - via 1st
interface in trunk)

I'm less worried about outbound; however, it is interesting that the em(4)
setup uses the first interface while the bnx(4) setup uses the second. by 1st
and 2nd I mean the order of addition inside hostname.if

$ cat /etc/hostname.trunk0
trunkproto lacp trunkport bnx0 trunkport bnx1
up
-inet6

$ cat /etc/hostname.trunk0
trunkproto lacp trunkport em0 trunkport em1
up
-inet6

on switch itself, both trunks have no visible difference in configuration.




 2013/11/11 Alexey Suslikov alexey.susli...@gmail.com

 On Mon, Nov 11, 2013 at 11:42 AM, Stuart Henderson s...@spacehopper.org
 wrote:
  master on em0/em1/bnx0 is nothing to do with trunk, it is about the
  gigabit ethernet clocking source.

 ok, but it is obvious: documentation is unclear (silent) about that.

 
  lacp hashing policy is the same as for loadbalance, see the manpage and
  confirm in trunk_hashmbuf().

 I see different inbound packet distribution on trunk on-top of em(4)s
 and on trunk on top of bnx(4)s -
 that's the real problem.




 --
 May the most significant bit of your life be positive.



Re: 2 x em(4) and 2 x bnx(4) trunk(4)s behave differently

2013-11-11 Thread Alexey Suslikov
On Mon, Nov 11, 2013 at 12:43 PM, Stuart Henderson st...@openbsd.org wrote:
 On 2013/11/11 12:15, Alexey Suslikov wrote:
 On Mon, Nov 11, 2013 at 11:42 AM, Stuart Henderson s...@spacehopper.org 
 wrote:
  master on em0/em1/bnx0 is nothing to do with trunk, it is about the 
  gigabit ethernet clocking source.

 ok, but it is obvious: documentation is unclear (silent) about that.

 Why would something listed as a media characteristic of the physical
 interface have anything to do with trunk?

well, I just expected to see master media option documented somewhere,
to make it clear what is trunk master and what is clocking master.


  lacp hashing policy is the same as for loadbalance, see the manpage and 
  confirm in trunk_hashmbuf().

 I see different inbound packet distribution on trunk on-top of em(4)s
 and on trunk on top of bnx(4)s -
 that's the real problem.

 The trunk driver can't influence inbound packet distribution, that is
 down to the device sending packets e.g. your switch..


yes, I know. but bnx(4) interfaces have master set differently, in contrast
to em(4) interfaces. I'm really guessing, but maybe that clocking source
has some effect for a switch.



Re: 2 x em(4) and 2 x bnx(4) trunk(4)s behave differently

2013-11-11 Thread Alexey Suslikov
On Mon, Nov 11, 2013 at 1:00 PM, Stuart Henderson st...@openbsd.org wrote:
 On 2013/11/11 12:15, Alexey Suslikov wrote:
  I see different inbound packet distribution on trunk on-top of em(4)s
  and on trunk on top of bnx(4)s -
  that's the real problem.

 On 2013/11/11 10:43, I wrote:
 The trunk driver can't influence inbound packet distribution, that is
 down to the device sending packets e.g. your switch..

 ... for newer HP L3 switches, you might want to look at
 trunk-load-balance L4-based, for ciscos port-channel load-balance..

yes, I'm aware of above options, but I have SPS2024-G5 in this setup.

did some tests with more servers involved, and outbound is no worry.
this is how trunk(4) hashing works.

IFACE   STATE  DESC     IPKTS  IBYTES   IERRS   OPKTS  OBYTES   OERRS   COLLS
bnx0    up:U              487  237275       0     129   41107       0       0
bnx1    up:U                5     360       0     348   65383       0       0

IFACE   STATE  DESC     IPKTS  IBYTES   IERRS   OPKTS  OBYTES   OERRS   COLLS
em0     up:U              228   54112       0     136   51470       0       0
em1     up:U              218   65348       0     322   79837       0       0

but bnx1 inbound is always showing 4-5 packets no matter how traffic
is distributed :/
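
As a rough illustration of why the outbound side evens out once more peers are involved: a hash-based policy maps each flow to exactly one port, so a single peer always lands on the same trunk port. The sketch below is not the trunk_hashmbuf() code; demo_flow, demo_pick_port and the use of FNV-1a are assumptions chosen for the example, and the set of hashed fields (MAC and IP addresses) is only meant to resemble what the manpage describes for loadbalance.

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a over a byte buffer; used here only as an illustrative hash. */
static uint32_t
demo_fnv1a(uint32_t h, const uint8_t *p, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

/* Hypothetical flow key; the real driver reads these from the mbuf. */
struct demo_flow {
	uint8_t  src_mac[6];
	uint8_t  dst_mac[6];
	uint32_t src_ip;
	uint32_t dst_ip;
};

/* Map a flow to one of nports trunk ports; same flow, same port. */
unsigned int
demo_pick_port(const struct demo_flow *f, unsigned int nports)
{
	uint32_t h = 2166136261u;

	if (nports == 0)
		return 0;
	h = demo_fnv1a(h, f->src_mac, sizeof(f->src_mac));
	h = demo_fnv1a(h, f->dst_mac, sizeof(f->dst_mac));
	h = demo_fnv1a(h, (const uint8_t *)&f->src_ip, sizeof(f->src_ip));
	h = demo_fnv1a(h, (const uint8_t *)&f->dst_ip, sizeof(f->dst_ip));
	return h % nports;
}
```

This only covers the transmit side of the local machine. Inbound distribution is decided by whatever hash the switch applies, which this sketch says nothing about.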



2 x em(4) and 2 x bnx(4) trunk(4)s behave differently

2013-11-10 Thread Alexey Suslikov
Hi tech@.

Two machines (A and B) running recent 5.4-stable plugged into same switch.

A has:

em0 at pci4 dev 0 function 0 Intel 82573E rev 0x03: msi, address
00:30:48:66:a0:ec
em1 at pci5 dev 0 function 0 Intel 82573L rev 0x00: msi, address
00:30:48:66:a0:ed

B has:

bnx0 at pci2 dev 0 function 0 Broadcom BCM5716 rev 0x20: apic 0 int 16
bnx1 at pci2 dev 0 function 1 Broadcom BCM5716 rev 0x20: apic 0 int 17

bnx0: address b8:ac:6f:91:48:da
brgphy0 at bnx0 phy 1: BCM5709 10/100/1000baseT PHY, rev. 8
bnx1: address b8:ac:6f:91:48:db
brgphy1 at bnx1 phy 1: BCM5709 10/100/1000baseT PHY, rev. 8

Both servers have LACP trunk(4)s built on top of the above-mentioned interfaces:

A has:

em0: flags=28b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST,NOINET6> mtu 1500
        lladdr 00:30:48:66:a0:ec
        priority: 0
        trunk: trunkdev trunk0
        media: Ethernet autoselect (1000baseT full-duplex,master)
        status: active
em1: flags=28b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST,NOINET6> mtu 1500
        lladdr 00:30:48:66:a0:ec
        priority: 0
        trunk: trunkdev trunk0
        media: Ethernet autoselect (1000baseT full-duplex,master)
        status: active
trunk0: flags=28943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,NOINET6> mtu 1500
        lladdr 00:30:48:66:a0:ec
        priority: 0
        trunk: trunkproto lacp
        trunk id: [(8000,00:30:48:66:a0:ec,402C,,),
                 (0001,ec:30:91:25:c0:4f,03E8,,)]
                trunkport em1 active,collecting,distributing
                trunkport em0 active,collecting,distributing
        groups: trunk
        media: Ethernet autoselect
        status: active

B has:

bnx0: flags=28b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST,NOINET6> mtu 1500
        lladdr b8:ac:6f:91:48:da
        priority: 0
        trunk: trunkdev trunk0
        media: Ethernet autoselect (1000baseT full-duplex)
        status: active
bnx1: flags=28b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST,NOINET6> mtu 1500
        lladdr b8:ac:6f:91:48:da
        priority: 0
        trunk: trunkdev trunk0
        media: Ethernet autoselect (1000baseT full-duplex,master)
        status: active
trunk0: flags=28843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,NOINET6> mtu 1500
        lladdr b8:ac:6f:91:48:da
        priority: 0
        trunk: trunkproto lacp
        trunk id: [(8000,b8:ac:6f:91:48:da,402C,,),
                 (0001,ec:30:91:25:c0:4f,03EA,,)]
                trunkport bnx1 active,collecting,distributing
                trunkport bnx0 active,collecting,distributing
        groups: trunk
        media: Ethernet autoselect
        status: active

Now about the difference.

A receives on both em0 and em1 and transmits on em0:

IFACE   STATE  DESC     IPKTS  IBYTES   IERRS   OPKTS  OBYTES   OERRS   COLLS
em0     up:U             2711   2859K       0    5593   5222K       0       0
em1     up:U             2867   2343K       0      10    3226       0       0
trunk0  up:U             5578   5202K       0    5603   5225K       0       0

B receives *only* on bnx0 and transmits *only* on bnx1:

IFACE   STATE  DESC     IPKTS  IBYTES   IERRS   OPKTS  OBYTES   OERRS   COLLS
bnx0    up:U             2873   2956K       0       2     977       0       0
bnx1    up:U                5     360       0    3119   2604K       0       0
trunk0  up:U             2878   2956K       0    3121   2605K       0       0

The only difference ifconfig shows is that both em(4)s are master
interfaces on A, but only bnx1 is a master interface on B. (I haven't
found any description of the master media option in the ifconfig man
page; the trunk man page mentions master, but only wrt failover mode.)

The whole situation smells like trunk(4) receives on *all* master
interfaces, but transmits on the *first available* master.

The question here is: why are both em(4)s master interfaces on A, but
only bnx1 is a master interface on B?

Another question: what is the transmit hash policy for a trunk in LACP
mode? If it matters, while testing on B, different MACs and different
VLANs were used, but the effect is the same: bnx0 only receives, bnx1
only transmits.

Anybody with trunking experience please speak up.

Thanks,
Alexey



Re: defer routing table updates on link state changes

2013-09-13 Thread Alexey Suslikov
Reyk Floeter wrote:

 Yes, in theory if_index should be fixed and return a consistent number
 between 1 and the number of interfaces.  But this is obviously
 difficult and I'm not sure if it's worth the effort.  So the hack
 that you're going to remove was a best effort.  But putting another
 interface index abstraction layer in userland (via snmpd or some
 shared db) is just not the right way to do it.  We either have a
 reliable if_index from the kernel or we don't.  But inventing another
 thing in userland doesn't make sense to me.

If above theory doesn't dictate all interfaces must exist (it shouldn't
because of hot-plug interfaces), kernel can operate on fixed predefined
ifIndex table like this:

tun ifIndex (only have 256 of them because of unit_no):
  1 - 00:bd:xx:xx:xx:00 - tun0
256 - 00:bd:xx:xx:xx:ff - tun255

vether ifIndex (only have 65536 of them?):
257 - fe:e1:ba:d0:xx:xx - vether0
 65,792 - fe:e1:ba:d0:xx:xx - vether65535

physical ifIndex (claim to support ~1M of physical interfaces):
 65,793 - 00:25:90:xx:xx:aa - em0
 65,794 - 00:25:90:xx:xx:ab - em1
  1,179,906 - xx:xx:xx:xx:xx:xx - foo77

trunk ifIndex (claim to support ~17M of trunk interfaces, by unit_no):
  1,179,907 - xx:xx:xx:xx:xx:xx - trunk0
 19,005,699 - xx:xx:xx:xx:xx:xx - trunk1699

vlan ifIndex (claim to support ~280M of vlan interfaces, by unit_no):
 19,005,700 - xx:xx:xx:xx:xx:xx - vlan0
304,218,372 - xx:xx:xx:xx:xx:xx - vlan27999

and so on, up to 2,147,483,647.
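
The fixed table above boils down to "ifIndex = first index of the class + slot number". A minimal sketch of that lookup follows, using the range boundaries from the table; demo_ifclass, demo_range and demo_ifindex are made-up names, and for physical interfaces the "unit" is assumed to be a slot handed out via ifconfig as suggested below, not the driver's unit number.

```c
#include <stdint.h>

/* Interface classes in the order they appear in the proposed table. */
enum demo_ifclass {
	DEMO_TUN, DEMO_VETHER, DEMO_PHYS, DEMO_TRUNK, DEMO_VLAN, DEMO_NCLASS
};

/* First and last ifIndex of each class, copied from the table. */
static const struct {
	int32_t first;
	int32_t last;
} demo_range[DEMO_NCLASS] = {
	{ 1, 256 },			/* tun */
	{ 257, 65792 },			/* vether */
	{ 65793, 1179906 },		/* physical */
	{ 1179907, 19005699 },		/* trunk */
	{ 19005700, 304218372 },	/* vlan */
};

/* Return the fixed ifIndex for (class, unit), or -1 if out of range. */
int32_t
demo_ifindex(enum demo_ifclass c, uint32_t unit)
{
	int32_t span;

	if ((int)c < 0 || c >= DEMO_NCLASS)
		return -1;
	span = demo_range[c].last - demo_range[c].first;
	if (unit > (uint32_t)span)
		return -1;
	return demo_range[c].first + (int32_t)unit;
}
```

The point of the layout is that the index is a pure function of class and slot, so it does not change when interfaces are created or destroyed in a different order.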

IMO, cloners aren't so problematic (because of algorithmically controlled
enumeration and unit number assignment) as physical interfaces are.

I think, the best is to let ifIndexes be assigned to physical interfaces
via ifconfig, but let cloners to do their assignments automatically.

And do not let snmpd to operate on interface without an ifIndex: having no
ifIndex means no interface available.



octeon bits on 54.html

2013-08-10 Thread Alexey Suslikov
Hi tech@.

54.html says:

 Ubiquiti Networks EdgeRouter LITE (no local storage)

How should I read it: an EdgeRouter LITE variant with no local storage or
local storage is currently not supported?

Cheers,
Alexey



Re: drm bits on 54.html

2013-08-10 Thread Alexey Suslikov
On Sat, Aug 10, 2013 at 11:58 AM, Brad Smith b...@comstyle.com wrote:
 - Original message -
 Hi tech@.

 54.html says:

  Now mostly in sync with Linux 3.8.13

 But there's no such thing as Linux X.X.X, there's a Linux kernel X.X.X.

 But there is. The latter is redundant. Linux is a kernel.

In geek world, maybe, but not in Real World (tm)

http://en.wikipedia.org/wiki/Linux



Re: drm bits on 54.html

2013-08-10 Thread Alexey Suslikov
On Sat, Aug 10, 2013 at 12:09 PM, Brad Smith b...@comstyle.com wrote:
 - Original message -
 On Sat, Aug 10, 2013 at 11:58 AM, Brad Smith b...@comstyle.com wrote:
  - Original message -
   Hi tech@.
  
   54.html says:
  
Now mostly in sync with Linux 3.8.13
  
   But there's no such thing as Linux X.X.X, there's a Linux kernel
   X.X.X.
 
  But there is. The latter is redundant. Linux is a kernel.

 In geek world, maybe, but not in Real World (tm)

 http://en.wikipedia.org/wiki/Linux

 Yes, real world so often uses names and terms improperly. whats new.

http://www.gnu.org/gnu/linux-and-gnu.html says

Linux is the kernel: the program in the system that allocates the machine's
resources to the other programs that you run. The kernel is an essential part
of an operating system, but useless by itself; it can only function in the
context of a complete operating system. Linux is normally used in combination
with the GNU operating system: the whole system is basically GNU with Linux
added, or GNU/Linux. All the so-called “Linux” distributions are really
distributions of GNU/Linux.

So I think you're right about using the term Linux. Sorry for the noise.



a.out in gcc-local(1)

2013-07-11 Thread Alexey Suslikov
Hi tech@

Just found no longer relevant block in gcc-local(1):

- On a.out platforms (i.e. vax), gcc uses a linker wrapper to write
  stubs that call global constructors and destructors.  Those platforms
  use gcc 2.95.3, and those calls can be traced using
  -Wl,-trace-ctors-dtors, using syslog_r(3).

Cheers,
Alexey



Re: Stop printing excessive numbers of ACPI wakeup devices

2013-06-01 Thread Alexey Suslikov
On Sun, Jun 2, 2013 at 12:05 AM, Theo de Raadt dera...@cvs.openbsd.org wrote:
 Mike Larkin mlarkin at azathoth.net writes:

  It's sometimes nice to know what devices can wake up a machine, and from 
  what
  sleep state. But I'm fine suppressing these also. Don't want this to end up
  being a bikeshed :)

 why not dnprintf them?

 good grief.  We are displaying the information because we still want to see
 it in dmesglogs so that we can improve suspend/resume ...

is there any possibility to end up with that information being under [...]?



Re: Stop printing excessive numbers of ACPI wakeup devices

2013-06-01 Thread Alexey Suslikov
On Sun, Jun 2, 2013 at 12:14 AM, Mark Kettenis mark.kette...@xs4all.nl wrote:
 Date: Sun, 2 Jun 2013 00:09:25 +0300
 From: Alexey Suslikov alexey.susli...@gmail.com

 On Sun, Jun 2, 2013 at 12:05 AM, Theo de Raadt dera...@cvs.openbsd.org 
 wrote:
  Mike Larkin mlarkin at azathoth.net writes:
 
   It's sometimes nice to know what devices can wake up a machine, and 
   from what
   sleep state. But I'm fine suppressing these also. Don't want this to 
   end up
   being a bikeshed :)
 
  why not dnprintf them?
 
  good grief.  We are displaying the information because we still want to see
  it in dmesglogs so that we can improve suspend/resume ...

 is there any possibility to end up with that information being under [...]?

 acpidump and disassemble the aml

 anyway, if somebody actually starts working on proper acpi wakeup
 support, we can temporarily enable printing them all again.

just an idea (I know more knobs are not good), but sysctl already have some
acpi related information (indirectly, tho), like

$ sysctl -a| grep -i acpi
kern.malloc.kmemnames=free,,devbuf,debug,pcb,routetbl,,fragtbl,,ifaddr,soopts,sysctl,,,ioctlops,iov,mount,,NFS_req,NFS_mount,,vnodes,namecache,UFS_quota,UFS_mount,shm,VM_map,sem,dirhash,ACPI,VM_pmapfile,file_desc,,proc,subproc,VFS_cluster,,,MFS_node,,,Export_Host,NFS_srvsock,,NFS_daemon,ip_moptions,in_multi,ether_multi,mrt,ISOFS_mount,ISOFS_node,MSDOSFS_mount,MSDOSFS_fat,MSDOSFS_node,ttys,exec,miscfs_mount,,pfkey_data,tdb,xform_data,,pagedep,inodedep,newblk,,,indirdep,VM_swap,,UVM_amap,UVM_aobj,,USB,USB_device,USB_HC,,memdesc,,,crypto_data,,IPsec_credsemuldata,ip6_options,NDP,,,temp,NTFS_mount,NTFS_node,NTFS_fnode,NTFS_dir,NTFS_hash,NTFS_attr,NTFS_data,NTFS_decomp,NTFS_vrun,kqueue,bluetooth,bwmeter,UDF_mount,UDF_file_entry,UDF_file_id,Bluetooth_HID,AGP_Memory,DRM
kern.malloc.kmemstat.ACPI=(inuse = 5429, calls = 20622, memuse = 638K,
limblocks = 0, mapblocks = 0, maxused = 660K, limit = 78644K, spare =
0, sizes = (16,32,64,128,256,512,2048))
kern.timecounter.hardware=acpihpet0
kern.timecounter.choice=i8254(0) acpihpet0(1000) acpitimer0(1000)
dummy(-100)

so maybe wakeup devices may end up under some sysctl path.



Question about MP safe audio/video

2013-05-24 Thread Alexey Suslikov
Hi tech@.

Are uvideo(4), bktr(4) and similar drivers also MP safe, or are they
somewhat different in terms of the technique used to make audio MP safe?

Cheers,
Alexey



amd64errata.c,v 1.4

2013-05-20 Thread Alexey Suslikov
For our crash, v 1.4 of amd64errata.c is a no-op unless we de-static
the function prototypes.

acpiprt0 at acpi0: bus 0 (PCI0)
mpbios0 at bios0: Intel MP Specification 1.4
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD Phenom(tm) 9550 Quad-Core Processor, 3600.54 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,CX16,POPCNT,NXE,MMXX,FFXSR,LONG,3DNOW2,3DNOW,LAHF,CMPLEG,SVM,AMCR8,ABM,SSE4A,MASSE,3DNOWP
cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache
cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
kernel: protection fault trap, code=0
Stopped at amd64_errata_setmsr()+0x10: rdmsr
amd64_errata_setmsr() at amd64_errata_setmsr()+0x10

amd64_errata() at amd64_errata+0xc9
identifycpu() at identifycpu+0x729
cpu_attach() at cpu_attach+0x2ce
config_attach() at config_attach+0x1d4
mpbios_cpu() at mpbios_cpu+0x5b
mpbios_scan() at mpbios_scan+0x355
config_attach() at config_attach+0x1d4
bios_attach() at bios_attach+0x296
config_attach() at config_attach+0x1d4
end trace frame: 0x81de9e30, count: 0
ddb{0}

ddb{0} trace
amd64_errata_setmsr() at amd64_errata_setmsr()+0x10

amd64_errata() at amd64_errata+0xc9
identifycpu() at identifycpu+0x729
cpu_attach() at cpu_attach+0x2ce
config_attach() at config_attach+0x1d4
mpbios_cpu() at mpbios_cpu+0x5b
mpbios_scan() at mpbios_scan+0x355
config_attach() at config_attach+0x1d4
bios_attach() at bios_attach+0x296
config_attach() at config_attach+0x1d4
mainbus_attach() at mainbus_attach+0x5b
config_attach() at config_attach+0x1d4
cpu_configure() at cpu_configure+0x17
main() at main+0x3f5
end trace frame: 0x0, count: -14
ddb{0}


Index: arch/amd64/amd64/amd64errata.c
===
RCS file: /cvs/src/sys/arch/amd64/amd64/amd64errata.c,v
retrieving revision 1.4
diff -u -p -u -p -r1.4 amd64errata.c
--- arch/amd64/amd64/amd64errata.c  20 May 2013 17:34:08 -  1.4
+++ arch/amd64/amd64/amd64errata.c  20 May 2013 20:17:00 -
@@ -129,8 +129,8 @@ static const uint8_t amd64_errata_set9[]
DA_C3, HY_D0, HY_D1, HY_D1_G34R1,  PH_E0, LN_B0, OINK
 };

-static int amd64_errata_setmsr(struct cpu_info *, errata_t *);
-static int amd64_errata_testmsr(struct cpu_info *, errata_t *);
+int amd64_errata_setmsr(struct cpu_info *, errata_t *);
+int amd64_errata_testmsr(struct cpu_info *, errata_t *);

 static errata_t errata[] = {
/*
Index: arch/i386/i386/amd64errata.c
===
RCS file: /cvs/src/sys/arch/i386/i386/amd64errata.c,v
retrieving revision 1.4
diff -u -p -u -p -r1.4 amd64errata.c
--- arch/i386/i386/amd64errata.c20 May 2013 17:34:08 -  1.4
+++ arch/i386/i386/amd64errata.c20 May 2013 20:17:01 -
@@ -129,8 +129,8 @@ static const uint8_t amd64_errata_set9[]
DA_C3, HY_D0, HY_D1, HY_D1_G34R1,  PH_E0, LN_B0, OINK
 };

-static int amd64_errata_setmsr(struct cpu_info *, errata_t *);
-static int amd64_errata_testmsr(struct cpu_info *, errata_t *);
+int amd64_errata_setmsr(struct cpu_info *, errata_t *);
+int amd64_errata_testmsr(struct cpu_info *, errata_t *);

 static errata_t errata[] = {
/*



Re: Possible relayd memory leak analysis

2013-05-17 Thread Alexey Suslikov
recent snaps don't have the above-mentioned problem. not sure what the
cause was, but the leak is gone.

On Tue, Apr 9, 2013 at 1:47 AM, Alexey Suslikov
alexey.susli...@gmail.com wrote:
 hi tech@

 tools used:
 * ps auxwww | grep relayd
 * httperf --hog --server=192.168.5.201 --wsess=25,1000,0.1 --rate=50 
 --timeout=5

 target machine:
 OpenBSD 5.3-current (GENERIC.MP) #0: Sun Apr  7 15:14:10 EEST 2013
 *@*:/usr/src/sys/arch/amd64/compile/GENERIC.MP

 /etc/relayd.conf:

 ext_addr=192.168.5.201
 webhost1=192.168.5.202
 webhost2=192.168.5.203

 prefork 2

 table web { $webhost1 $webhost2 }

 http protocol proto_pool_http {
 header append $REMOTE_ADDR to X-Forwarded-For
 header append $SERVER_ADDR:$SERVER_PORT to X-Forwarded-By
 header change Connection to close
 }

 relay cluster_pool_http {
 listen on $ext_addr port www
 protocol proto_pool_http
 forward to web port www mode roundrobin check http /index.html
 host test.local code 200
 }

 cold ps auxwww:

 root 31403  0.0  0.1  1160  1916 ??  Ss12:21AM0:00.03
 relayd: parent (relayd)
 _relayd  18684  0.0  0.1  1044  2056 ??  S 12:21AM0:00.01
 relayd: pfe (relayd)
 _relayd  29554  0.0  0.1   912  1948 ??  S 12:21AM0:00.01
 relayd: hce (relayd)
 _relayd   7937  0.0  0.1  1108  2020 ??  S 12:21AM0:00.02
 relayd: relay (relayd)
 _relayd  28352  0.0  0.1  1108  2036 ??  S 12:21AM0:00.00
 relayd: relay (relayd)

 ps auxwww after 1st httperf run:

 _relayd  28352  4.1  0.6 10280 11672 ??  S 12:21AM0:08.83
 relayd: relay (relayd)
 _relayd   7937  4.8  0.6 10620 12004 ??  S 12:21AM0:09.17
 relayd: relay (relayd)
 root 31403  0.0  0.1  1160  1916 ??  Is12:21AM0:00.03
 relayd: parent (relayd)
 _relayd  18684  0.0  0.1  1044  2056 ??  S 12:21AM0:00.02
 relayd: pfe (relayd)
 _relayd  29554  0.0  0.1   912  1948 ??  S 12:21AM0:00.03
 relayd: hce (relayd)

 ps auxwww after 2nd httperf run:

 _relayd  28352  1.5  1.0 19424 20816 ??  S 12:21AM0:17.77
 relayd: relay (relayd)
 _relayd   7937  1.4  1.0 19724 21108 ??  S 12:21AM0:18.11
 relayd: relay (relayd)
 root 31403  0.0  0.1  1160  1916 ??  Is12:21AM0:00.03
 relayd: parent (relayd)
 _relayd  18684  0.0  0.1  1044  2056 ??  S 12:21AM0:00.02
 relayd: pfe (relayd)
 _relayd  29554  0.0  0.1   912  1952 ??  S 12:21AM0:00.05
 relayd: hce (relayd)

 on busy production setup relayd continuously leaks and eventually crashes.



cvsweb says 'No viewable change' for i915_drv.c diffs

2013-05-15 Thread Alexey Suslikov
Hi.

Try this
http://www.openbsd.org/cgi-bin/cvsweb/src/sys/dev/pci/drm/i915/i915_drv.c.diff?r1=1.26;r2=1.27;f=h

and, for instance, this
http://www.openbsd.org/cgi-bin/cvsweb/src/sys/dev/pci/drm/i915/i915_dma.c.diff?r1=1.6;r2=1.7;f=h

The former says "No viewable change". I don't think that is normal. Am I wrong?

Cheers,
Alexey



Re: cvsweb says 'No viewable change' for i915_drv.c diffs

2013-05-15 Thread Alexey Suslikov
googled gnu cvs utf diff and found this

http://stackoverflow.com/questions/778291/how-do-i-diff-utf-16-files-with-gnu-diff

On Wed, May 15, 2013 at 2:03 PM, Stuart Henderson st...@openbsd.org wrote:
 On 2013/05/15 11:53, Stuart Henderson wrote:
 On 2013/05/15 10:43, Alexey E. Suslikov wrote:
  Mark Kettenis mark.kettenis at xs4all.nl writes:
 
Try this
   
  http://www.openbsd.org/cgi-bin/cvsweb/src/sys/dev/pci/drm/i915/i915_drv.c.diff?r1=1.26;r2=1.27;f=h
   
and, for instance, this
   
  http://www.openbsd.org/cgi-bin/cvsweb/src/sys/dev/pci/drm/i915/i915_dma.c.diff?r1=1.6;r2=1.7;f=h
   
Former says No viewable change. I think it isn't normal. Am I wrong?
  
   Yes that's very annoying.  I suspect cvsweb has problems with the UTF8
   characters in the copyright header.
 
  cvsweb operates on individual diff chunks while preparing
  viewable output, right?
 
  if so, and you are right about UTF8, only one of these chunks
  is a showstopper.
 
  maybe cvsweb may say No viewable change for a problematic
  chunk only, instead of completely freaking out.
 

 it's not cvsweb.

 $ rcsdiff -u -r1.26 -r1.27 /cvs/src/sys/dev/pci/drm/i915/i915_drv.c,v
 ===
 RCS file: /cvs/src/sys/dev/pci/drm/i915/i915_drv.c,v
 retrieving revision 1.26
 retrieving revision 1.27
 diff -u -r1.26 -r1.27


 ...and yes, it is due to the UTF8 characters: replacing them in the ,v
 file lets it work.




Re: external ip/tcp (sysctl) variables

2013-04-08 Thread Alexey Suslikov
 RCS file: /home/ncvs/src/sys/netinet/ip_var.h,v
 retrieving revision 1.44
 diff -u -p -r1.44 ip_var.h
 --- netinet/ip_var.h 16 Jul 2012 18:05:36 - 1.44
 +++ netinet/ip_var.h 8 Apr 2013 13:23:23 -
 @@ -149,8 +149,20 @@ extern struct ipstat ipstat;
  extern LIST_HEAD(ipqhead, ipq) ipq; /* ip reass. queue */
  extern int ip_defttl; /* default IP ttl */

 +extern struct socket *ip_mrouter; /* multicast routing daemon */
 +
  extern int ip_mtudisc; /* mtu discovery */
  extern u_int ip_mtudisc_timeout; /* seconds to timeout mtu discovery */
 +
 +extern int ipport_firstauto; /* min port for port allocation */
 +extern int ipport_lastauto; /* max port for port allocation */
 +extern int ipport_hifirstauto; /* min dynamic/private port number */
 +extern int ipport_hilastauto; /* max dynamic/private port number */
 +extern int encdebug; /* enable message reporting */
 +extern int ipforwarding; /* enable IP forwarding */
 +extern int ipmforwarding; /* enable multicast forwarding */

previously, ipmforwarding and ip_mrouter were declared under #ifdef MROUTING.

is it intentional that your diff moves them outside that #ifdef?
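
To make the concern concrete, here is a minimal userland sketch of what an #ifdef guard does to a symbol. The variables are hypothetical stand-ins, not the real kernel definitions, and mforwarding_state() is a helper invented for this illustration. When MROUTING is not defined, ipmforwarding does not exist at all, so any code that refers to it must sit under the same guard.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel globals; not the real definitions. */
int ipforwarding = 0;		/* always present */
#ifdef MROUTING
int ipmforwarding = 0;		/* only present when MROUTING is defined */
#endif

/*
 * Illustrative helper (an assumption, not kernel code): report the
 * multicast forwarding flag, or -1 when support is compiled out.
 */
int
mforwarding_state(void)
{
#ifdef MROUTING
	return ipmforwarding;
#else
	return -1;
#endif
}
```

Building this with and without -DMROUTING shows the difference: an unconditional extern declaration in a header compiles either way, but any use of the variable fails at link time in the configuration where the definition is still guarded.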



Possible relayd memory leak analysis

2013-04-08 Thread Alexey Suslikov
hi tech@

tools used:
* ps auxwww | grep relayd
* httperf --hog --server=192.168.5.201 --wsess=25,1000,0.1 --rate=50 --timeout=5

target machine:
OpenBSD 5.3-current (GENERIC.MP) #0: Sun Apr  7 15:14:10 EEST 2013
*@*:/usr/src/sys/arch/amd64/compile/GENERIC.MP

/etc/relayd.conf:

ext_addr=192.168.5.201
webhost1=192.168.5.202
webhost2=192.168.5.203

prefork 2

table web { $webhost1 $webhost2 }

http protocol proto_pool_http {
header append $REMOTE_ADDR to X-Forwarded-For
header append $SERVER_ADDR:$SERVER_PORT to X-Forwarded-By
header change Connection to close
}

relay cluster_pool_http {
listen on $ext_addr port www
protocol proto_pool_http
forward to web port www mode roundrobin check http /index.html
host test.local code 200
}

cold ps auxwww:

root     31403  0.0  0.1  1160  1916 ??  Ss    12:21AM    0:00.03 relayd: parent (relayd)
_relayd  18684  0.0  0.1  1044  2056 ??  S     12:21AM    0:00.01 relayd: pfe (relayd)
_relayd  29554  0.0  0.1   912  1948 ??  S     12:21AM    0:00.01 relayd: hce (relayd)
_relayd   7937  0.0  0.1  1108  2020 ??  S     12:21AM    0:00.02 relayd: relay (relayd)
_relayd  28352  0.0  0.1  1108  2036 ??  S     12:21AM    0:00.00 relayd: relay (relayd)

ps auxwww after 1st httperf run:

_relayd  28352  4.1  0.6 10280 11672 ??  S     12:21AM    0:08.83 relayd: relay (relayd)
_relayd   7937  4.8  0.6 10620 12004 ??  S     12:21AM    0:09.17 relayd: relay (relayd)
root     31403  0.0  0.1  1160  1916 ??  Is    12:21AM    0:00.03 relayd: parent (relayd)
_relayd  18684  0.0  0.1  1044  2056 ??  S     12:21AM    0:00.02 relayd: pfe (relayd)
_relayd  29554  0.0  0.1   912  1948 ??  S     12:21AM    0:00.03 relayd: hce (relayd)

ps auxwww after 2nd httperf run:

_relayd  28352  1.5  1.0 19424 20816 ??  S     12:21AM    0:17.77 relayd: relay (relayd)
_relayd   7937  1.4  1.0 19724 21108 ??  S     12:21AM    0:18.11 relayd: relay (relayd)
root     31403  0.0  0.1  1160  1916 ??  Is    12:21AM    0:00.03 relayd: parent (relayd)
_relayd  18684  0.0  0.1  1044  2056 ??  S     12:21AM    0:00.02 relayd: pfe (relayd)
_relayd  29554  0.0  0.1   912  1952 ??  S     12:21AM    0:00.05 relayd: hce (relayd)

on busy production setup relayd continuously leaks and eventually crashes.
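
A rough estimate of the leak rate can be read off the numbers above. The sketch below assumes the standard ps aux column order (sixth column is RSS in kilobytes), takes the RSS of relay pid 28352 at the three sample points, and assumes that every one of the 25 httperf sessions completed all 1000 calls, which the output above does not confirm. The helper names are invented for this illustration.

```c
#include <assert.h>

/* RSS in KB of relay pid 28352: cold, after run 1, after run 2. */
static const long rss_kb[] = { 2036, 11672, 20816 };

/* From --wsess=25,1000,0.1; assumes no session was cut short. */
enum { SESSIONS = 25, CALLS_PER_SESSION = 1000 };

/* RSS growth in KB caused by the given httperf run (1 or 2). */
long
rss_growth_kb(int run)
{
	return rss_kb[run] - rss_kb[run - 1];
}

/* Approximate bytes retained per request in this one relay process. */
long
bytes_per_request(int run)
{
	return rss_growth_kb(run) * 1024 / (SESSIONS * CALLS_PER_SESSION);
}
```

Under those assumptions each run adds a bit over 9 MB to this relay, a few hundred bytes per request, and the second relay (pid 7937) grows by a similar amount. The growth is close to linear in the number of requests, which is what a per-request leak would look like.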



SSE4.2 CRC32 question

2013-03-27 Thread Alexey Suslikov
Hi tech@.

Can OpenBSD use SSE4.2 CRC32 (found on Core i7) to speedup
TCP/IP checksum calculations?
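
For context on the question: as far as I understand it, the SSE4.2 CRC32 instruction implements the CRC-32C (Castagnoli) polynomial, while IP, TCP and UDP use the 16-bit ones' complement Internet checksum from RFC 1071, so the two are different algorithms. The sketch below shows both in portable C. The function names are mine, and the CRC is a plain bitwise version rather than the hardware instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum: ones' complement of the 16-bit ones' complement sum. */
uint16_t
inet_checksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;

	while (len > 1) {
		sum += ((uint32_t)buf[0] << 8) | buf[1];
		buf += 2;
		len -= 2;
	}
	if (len == 1)			/* odd trailing byte, padded with zero */
		sum += (uint32_t)buf[0] << 8;
	while (sum >> 16)		/* fold carries back into the low 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Bitwise CRC-32C (reflected polynomial 0x82f63b78), software equivalent of the instruction. */
uint32_t
crc32c(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xffffffffu;
	int k;

	while (len--) {
		crc ^= *buf++;
		for (k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82f63b78u & (0u - (crc & 1u)));
	}
	return ~crc;
}
```

The eight example bytes from RFC 1071 (00 01 f2 03 f4 f5 f6 f7) give an Internet checksum of 0x220d, and the string "123456789" gives the usual CRC-32C check value 0xe3069283. Protocols that do use CRC-32C, such as SCTP and iSCSI, are where the instruction would apply directly.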

Cheers,
Alexey



Re: goodbye to some isa devices

2013-03-27 Thread Alexey Suslikov
On Wed, Mar 27, 2013 at 10:04 PM, Miod Vallat m...@online.fr wrote:
 Not sure about ancient 3Com's, but they are Ethernet at
 least, in contrast to Token-Ring devices like tr*.

 Do we support Token-Ring?

 We used to, on TRopic boards, but since public documentation for TR
 hardware amounts to zilch, and there is no interest in changing this
 situation, it was eventually removed from the tree to clear the way of
 other changes.

And with no TR stack, is there any reason for
sys/arch/i386/conf/GENERIC to contain these

#tr0 at isa? port 0xa20 iomem 0xd8000 # IBM TROPIC based Token-Ring
#tr1 at isa? port 0xa24 iomem 0xd # IBM TROPIC based Token-Ring
#tr* at isa? # 3COM TROPIC based Token-Ring

?



Re: goodbye to some isa devices

2013-03-27 Thread Alexey Suslikov
On Wed, Mar 27, 2013 at 10:24 PM, Miod Vallat m...@online.fr wrote:
  Do we support Token-Ring?
 
  We used to, on TRopic boards, but since public documentation for TR
  hardware amounts to zilch, and there is no interest in changing this
  situation, it was eventually removed from the tree to clear the way of
  other changes.

 And with no TR stack, is there any reason for
 sys/arch/i386/conf/GENERIC to contain these

 #tr0 at isa? port 0xa20 iomem 0xd8000 # IBM TROPIC based Token-Ring
 #tr1 at isa? port 0xa24 iomem 0xd # IBM TROPIC based Token-Ring
 #tr* at isa? # 3COM TROPIC based Token-Ring

 ?

 Definitely not, this is a leftover of the token ring pruning. Thanks for
 noticing!

btw, if you guys are still looking for something to disable
in sys/arch/i386/conf/RAMDISK_CD, take a look at these

ie0 at isa? port 0x360 iomem 0xd irq 7  # StarLAN and 3C507
le0 at isa? port 0x360 irq 15 drq 6 # IsoLan, NE2100, and DEPCA
le* at isapnp?



5.3 lyrics

2013-03-25 Thread Alexey Suslikov
hi tech@.

despite the 5.3 lyrics being released recently,
53.html says the opposite. is that normal?

cheers,
alexey



Re: Threads related SIGSEGV in random.c (diff, v2)

2013-03-14 Thread Alexey Suslikov
On Thu, Mar 14, 2013 at 6:48 PM, Ted Unangst t...@tedunangst.com wrote:
 On Thu, Mar 14, 2013 at 17:24, Antoine Jacoutot wrote:
 On Thu, Mar 14, 2013 at 11:41:52AM -0400, Ted Unangst wrote:
 On Thu, Mar 14, 2013 at 14:30, Antoine Jacoutot wrote:

  FYI I am seeing a somehow similar crash when using sysutils/bacula (both
  5.2 and 5.3).
  It is 100% reproducible on my setup. Obviously painful since it means I
  cannot run backups anymore...

 The following is brought to you without testing or warranty. It did
 compile at least once though.

 Awesome, thanks! I ran several batches of concurrent backups and I cannot
 reproduce the crash anymore :-)
 I'm going to run with that patch for the time being... if I spot any
 regression, I'll let you know.

 Couple fixes. In some error cases, there are early returns I didn't
 notice before. Fixed diff below, though I don't think a correct
 program should be affected.

 Alexey, sorry, I didn't get to your final diff before. It's very
 similar to the diff below, so you were on the right track. One thing
 that's different is you created unique special functions.

Thanks Ted. Glad to see this diff back.

I stopped pushing the diff because of zero feedback.

If you don't mind, put some credit to Roman Kravchuk
when you will commit, as he did most of work. I just
pushed diff.



Re: savecore on swap-less amd64 box

2012-12-17 Thread Alexey Suslikov
On Mon, Dec 17, 2012 at 2:44 PM, Mark Kettenis mark.kette...@xs4all.nl wrote:
 Date: Mon, 17 Dec 2012 14:14:40 +0200
 From: Alexey Suslikov alexey.susli...@gmail.com

 Hello tech@.

 On swap-less amd64 box using 20121213 amd64 snap, I have noticed a difference
 in how savecore behaves on SP and MP kernels.

 During boot, I see

 OpenBSD 5.2-current (GENERIC) #6: Wed Dec 12 23:16:44 MST 2012
 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC

 savecore: /dev/sd0b: Device not configured

 OpenBSD 5.2-current (GENERIC.MP) #5: Wed Dec 12 23:22:46 MST 2012
 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

 savecore: can't find device 0/127

 $ cat /etc/fstab
 93e3a680795f1b55.a / ffs rw,softdep 1 1

 Is this normal, or have I missed something?

 reboot that GENERIC.MP kernel once more

yep. that fixed the problem.

just curious, what was it and why?

Cheers,
Alexey



Re: MBA remove unneeded quirk

2012-12-04 Thread Alexey Suslikov
On Tue, Dec 4, 2012 at 2:38 PM, Alexandre Ratchov a...@caoua.org wrote:
 On Sat, Dec 01, 2012 at 05:14:46PM +0800, Ray Lai wrote:
 I'm not sure why jakemsr's diff[1] has AZ_QRK_GPIO_UNMUTE_1, my MBA
 works fine without it. I've tested both left and right channels on both
 speakers and headphones.

 -Ray-

 [1]: http://marc.info/?l=openbsd-misc&m=128919130029011&w=2

 Index: dev/pci/azalia_codec.c
 ===
 RCS file: /home/cvs/src/sys/dev/pci/azalia_codec.c,v
 retrieving revision 1.152
 diff -u -p -r1.152 azalia_codec.c
 --- dev/pci/azalia_codec.c30 Nov 2012 12:05:45 -  1.152
 +++ dev/pci/azalia_codec.c1 Dec 2012 08:27:31 -
 @@ -67,8 +67,7 @@ azalia_codec_init_vtbl(codec_t *this)
   case 0x10134206:
   this->name = "Cirrus Logic CS4206";
   if (this->subid == 0xcb8910de) {/* APPLE_MBA3_1 */
 - this->qrks |= AZ_QRK_GPIO_UNMUTE_1 |
 - AZ_QRK_GPIO_UNMUTE_3;
 + this->qrks |= AZ_QRK_GPIO_UNMUTE_3;
   }
   break;
   case 0x10ec0260:


 Hey,

 Linux hda driver seems to use gpio 1 and 3 by default for most
 apple products. Does the gpio 1 quirk hurts in any way? If it
 doesn't, I'd leave it unless I'm missing the reason why it's not
 needed.

 BTW, did you get any test reports?

 -- Alexandre

I agree. Other models/codecs/wiring may require both GPIO 1 and 3,
so the more general case is preferred here.

Cheers,
Alexey



Re: [PATCH, TEST] Make functions in random.c thread safe

2012-10-10 Thread Alexey Suslikov
Hi.

Roman and I are wondering why there have been no comments on this.

We'll try to improve the diff if it is not ok. Just let us know.

Anyone? :)

On Wed, Oct 3, 2012 at 4:06 PM, Alexey Suslikov
alexey.susli...@gmail.com wrote:
 Hi.

 Is there any progress/comments on this?

 On Fri, Sep 28, 2012 at 11:29 PM, Alexey Suslikov
 alexey.susli...@gmail.com wrote:
 Hi.

 With input from tedu@, guenther@ and others, below are:
 1) test case;
 2) backtrace for test case;
 3) locking diff;
 4) dmesg (amd64 GENERIC.MP built from 2012-09-28 CVS).

 Diff introduces no changes to srandomdev(): correct me if I'm wrong,
 but no mutex can be used since sysctl can sleep.

 Rebuild and reinstall in src/lib/librthread and src/lib/libc after applying
 the diff.

 Expect test case (and Kannel port of course) not crashing after rebuild
 and reinstall.

 Cheers,
 Alexey

 1) test case.

 #include <pthread.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <assert.h>
 #include <unistd.h>

 #define NUM_THREADS 1800

 void *TaskCode(void *argument)
 {
     struct timeval  tv;

     gettimeofday(&tv, 0);
     srandom((getpid() << 16) ^ getuid() ^ tv.tv_sec ^ tv.tv_usec);

     return NULL;
 }

 int main(void)
 {
     pthread_t threads[NUM_THREADS];
     int thread_args[NUM_THREADS];
     int rc, i;

     /* create all threads */
     for (i=0; i<NUM_THREADS; ++i) {
         thread_args[i] = i;
         rc = pthread_create(&threads[i], NULL, TaskCode,
             (void *) &thread_args[i]);
         assert(0 == rc);
     }

     /* wait for all threads to complete */
     for (i=0; i<NUM_THREADS; ++i) {
         rc = pthread_join(threads[i], NULL);
         assert(0 == rc);
     }

     printf("Test srandom success\n");
     exit(EXIT_SUCCESS);
 }

 2) backtrace for test case.

 Program received signal SIGSEGV, Segmentation fault.
 [Switching to thread 1030380]
 0x19d34d618f8e in random () at /usr/src/lib/libc/stdlib/random.c:387
 387 *fptr += *rptr;

 (gdb) bt
 #0  0x19d34d618f8e in random () at /usr/src/lib/libc/stdlib/random.c:387
 #1  0x19d34d619169 in srandom (x=Variable x is not available.
 ) at /usr/src/lib/libc/stdlib/random.c:216
 #2  0x19d14fe1 in TaskCode (argument=0x7f7ea004) at
 test_srandom.c:14
 #3  0x19d34999d11e in _rthread_start (v=Variable v is not available.
 ) at /usr/src/lib/librthread/rthread.c:122
 #4  0x19d34d5f0f9b in __tfork_thread () at
 /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:75
 Cannot access memory at address 0x19d344efb000

 3) locking diff.

 Index: lib/libc/include/thread_private.h
 ===
 RCS file: /cvs/src/lib/libc/include/thread_private.h,v
 retrieving revision 1.25
 diff -u -p -r1.25 thread_private.h
 --- lib/libc/include/thread_private.h   16 Oct 2011 06:29:56 -  1.25
 +++ lib/libc/include/thread_private.h   27 Sep 2012 10:48:45 -
 @@ -172,4 +172,16 @@ void   _thread_arc4_unlock(void);
 _thread_arc4_unlock();\
 } while (0)

 +void   _thread_random_lock(void);
 +void   _thread_random_unlock(void);
 +
 +#define _RANDOM_LOCK() do {\
 +   if (__isthreaded)   \
 +   _thread_random_lock();  \
 +   } while (0)
 +#define _RANDOM_UNLOCK()   do {\
 +   if (__isthreaded)   \
 +   _thread_random_unlock();\
 +   } while (0)
 +
  #endif /* _THREAD_PRIVATE_H_ */
 Index: lib/libc/stdlib/random.c
 ===
 RCS file: /cvs/src/lib/libc/stdlib/random.c,v
 retrieving revision 1.17
 diff -u -p -r1.17 random.c
 --- lib/libc/stdlib/random.c1 Jun 2012 01:01:57 -   1.17
 +++ lib/libc/stdlib/random.c27 Sep 2012 10:48:45 -
 @@ -35,6 +35,10 @@
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>
 +#include "thread_private.h"
 +
 +static void srandom_unlocked(unsigned int);
 +static long random_unlocked(void);

  /*
   * random.c:
 @@ -186,8 +190,8 @@ static int rand_sep = SEP_3;
   * introduced by the L.C.R.N.G.  Note that the initialization of randtbl[]
   * for default usage relies on values produced by this routine.
   */
 -void
 -srandom(unsigned int x)
 +static void
 +srandom_unlocked(unsigned int x)
  {
 int i;
 int32_t test;
 @@ -213,10 +217,18 @@ srandom(unsigned int x)
 fptr = &state[rand_sep];
 rptr = &state[0];
 for (i = 0; i < 10 * rand_deg; i++)
 -   (void)random();
 +   (void)random_unlocked();
 }
  }

 +void
 +srandom

Re: [PATCH, TEST] Make functions in random.c thread safe

2012-10-03 Thread Alexey Suslikov
Hi.

Is there any progress/comments on this?

On Fri, Sep 28, 2012 at 11:29 PM, Alexey Suslikov
alexey.susli...@gmail.com wrote:
 Hi.

 With input from tedu@, guenther@ and others, below are:
 1) test case;
 2) backtrace for test case;
 3) locking diff;
 4) dmesg (amd64 GENERIC.MP built from 2012-09-28 CVS).

 Diff introduces no changes to srandomdev(): correct me if I'm wrong,
 but no mutex can be used since sysctl can sleep.

 Rebuild and reinstall in src/lib/librthread and src/lib/libc after applying
 the diff.

 Expect test case (and Kannel port of course) not crashing after rebuild
 and reinstall.

 Cheers,
 Alexey

 1) test case.

 #include <pthread.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <assert.h>
 #include <unistd.h>

 #define NUM_THREADS 1800

 void *TaskCode(void *argument)
 {
     struct timeval  tv;

     gettimeofday(&tv, 0);
     srandom((getpid() << 16) ^ getuid() ^ tv.tv_sec ^ tv.tv_usec);

     return NULL;
 }

 int main(void)
 {
     pthread_t threads[NUM_THREADS];
     int thread_args[NUM_THREADS];
     int rc, i;

     /* create all threads */
     for (i=0; i<NUM_THREADS; ++i) {
         thread_args[i] = i;
         rc = pthread_create(&threads[i], NULL, TaskCode,
             (void *) &thread_args[i]);
         assert(0 == rc);
     }

     /* wait for all threads to complete */
     for (i=0; i<NUM_THREADS; ++i) {
         rc = pthread_join(threads[i], NULL);
         assert(0 == rc);
     }

     printf("Test srandom success\n");
     exit(EXIT_SUCCESS);
 }

 2) backtrace for test case.

 Program received signal SIGSEGV, Segmentation fault.
 [Switching to thread 1030380]
 0x19d34d618f8e in random () at /usr/src/lib/libc/stdlib/random.c:387
 387 *fptr += *rptr;

 (gdb) bt
 #0  0x19d34d618f8e in random () at /usr/src/lib/libc/stdlib/random.c:387
 #1  0x19d34d619169 in srandom (x=Variable x is not available.
 ) at /usr/src/lib/libc/stdlib/random.c:216
 #2  0x19d14fe1 in TaskCode (argument=0x7f7ea004) at
 test_srandom.c:14
 #3  0x19d34999d11e in _rthread_start (v=Variable v is not available.
 ) at /usr/src/lib/librthread/rthread.c:122
 #4  0x19d34d5f0f9b in __tfork_thread () at
 /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:75
 Cannot access memory at address 0x19d344efb000

 3) locking diff.

 Index: lib/libc/include/thread_private.h
 ===
 RCS file: /cvs/src/lib/libc/include/thread_private.h,v
 retrieving revision 1.25
 diff -u -p -r1.25 thread_private.h
 --- lib/libc/include/thread_private.h   16 Oct 2011 06:29:56 -  1.25
 +++ lib/libc/include/thread_private.h   27 Sep 2012 10:48:45 -
 @@ -172,4 +172,16 @@ void   _thread_arc4_unlock(void);
 _thread_arc4_unlock();\
 } while (0)

 +void   _thread_random_lock(void);
 +void   _thread_random_unlock(void);
 +
 +#define _RANDOM_LOCK() do {\
 +   if (__isthreaded)   \
 +   _thread_random_lock();  \
 +   } while (0)
 +#define _RANDOM_UNLOCK()   do {\
 +   if (__isthreaded)   \
 +   _thread_random_unlock();\
 +   } while (0)
 +
  #endif /* _THREAD_PRIVATE_H_ */
 Index: lib/libc/stdlib/random.c
 ===
 RCS file: /cvs/src/lib/libc/stdlib/random.c,v
 retrieving revision 1.17
 diff -u -p -r1.17 random.c
 --- lib/libc/stdlib/random.c1 Jun 2012 01:01:57 -   1.17
 +++ lib/libc/stdlib/random.c27 Sep 2012 10:48:45 -
 @@ -35,6 +35,10 @@
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>
 +#include "thread_private.h"
 +
 +static void srandom_unlocked(unsigned int);
 +static long random_unlocked(void);

  /*
   * random.c:
 @@ -186,8 +190,8 @@ static int rand_sep = SEP_3;
   * introduced by the L.C.R.N.G.  Note that the initialization of randtbl[]
   * for default usage relies on values produced by this routine.
   */
 -void
 -srandom(unsigned int x)
 +static void
 +srandom_unlocked(unsigned int x)
  {
 int i;
 int32_t test;
 @@ -213,10 +217,18 @@ srandom(unsigned int x)
 fptr = &state[rand_sep];
 rptr = &state[0];
 for (i = 0; i < 10 * rand_deg; i++)
 -   (void)random();
 +   (void)random_unlocked();
 }
  }

 +void
 +srandom(unsigned int x)
 +{
 +   _RANDOM_LOCK();
 +   srandom_unlocked(x);
 +   _RANDOM_UNLOCK();
 +}
 +
  /*
   * srandomdev:
   *
 @@ -273,12 +285,15 @@ initstate(u_int seed, char *arg_state, s
  {
 char *ostate

[PATCH, TEST] Make functions in random.c thread safe

2012-09-28 Thread Alexey Suslikov
Hi.

With input from tedu@, guenther@ and others, below are:
1) test case;
2) backtrace for test case;
3) locking diff;
4) dmesg (amd64 GENERIC.MP built from 2012-09-28 CVS).

Diff introduces no changes to srandomdev(): correct me if I'm wrong,
but no mutex can be used since sysctl can sleep.

Rebuild and reinstall in src/lib/librthread and src/lib/libc after applying
the diff.

Expect test case (and Kannel port of course) not crashing after rebuild
and reinstall.

Cheers,
Alexey

1) test case.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <unistd.h>

#define NUM_THREADS 1800

void *TaskCode(void *argument)
{
    struct timeval  tv;

    gettimeofday(&tv, 0);
    srandom((getpid() << 16) ^ getuid() ^ tv.tv_sec ^ tv.tv_usec);

    return NULL;
}

int main(void)
{
    pthread_t threads[NUM_THREADS];
    int thread_args[NUM_THREADS];
    int rc, i;

    /* create all threads */
    for (i=0; i<NUM_THREADS; ++i) {
        thread_args[i] = i;
        rc = pthread_create(&threads[i], NULL, TaskCode,
            (void *) &thread_args[i]);
        assert(0 == rc);
    }

    /* wait for all threads to complete */
    for (i=0; i<NUM_THREADS; ++i) {
        rc = pthread_join(threads[i], NULL);
        assert(0 == rc);
    }

    printf("Test srandom success\n");
    exit(EXIT_SUCCESS);
}

2) backtrace for test case.

Program received signal SIGSEGV, Segmentation fault.
[Switching to thread 1030380]
0x19d34d618f8e in random () at /usr/src/lib/libc/stdlib/random.c:387
387 *fptr += *rptr;

(gdb) bt
#0  0x19d34d618f8e in random () at /usr/src/lib/libc/stdlib/random.c:387
#1  0x19d34d619169 in srandom (x=Variable x is not available.
) at /usr/src/lib/libc/stdlib/random.c:216
#2  0x19d14fe1 in TaskCode (argument=0x7f7ea004) at
test_srandom.c:14
#3  0x19d34999d11e in _rthread_start (v=Variable v is not available.
) at /usr/src/lib/librthread/rthread.c:122
#4  0x19d34d5f0f9b in __tfork_thread () at
/usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:75
Cannot access memory at address 0x19d344efb000

3) locking diff.

Index: lib/libc/include/thread_private.h
===
RCS file: /cvs/src/lib/libc/include/thread_private.h,v
retrieving revision 1.25
diff -u -p -r1.25 thread_private.h
--- lib/libc/include/thread_private.h   16 Oct 2011 06:29:56 -  1.25
+++ lib/libc/include/thread_private.h   27 Sep 2012 10:48:45 -
@@ -172,4 +172,16 @@ void   _thread_arc4_unlock(void);
_thread_arc4_unlock();\
} while (0)

+void   _thread_random_lock(void);
+void   _thread_random_unlock(void);
+
+#define _RANDOM_LOCK() do {\
+   if (__isthreaded)   \
+   _thread_random_lock();  \
+   } while (0)
+#define _RANDOM_UNLOCK()   do {\
+   if (__isthreaded)   \
+   _thread_random_unlock();\
+   } while (0)
+
 #endif /* _THREAD_PRIVATE_H_ */
Index: lib/libc/stdlib/random.c
===
RCS file: /cvs/src/lib/libc/stdlib/random.c,v
retrieving revision 1.17
diff -u -p -r1.17 random.c
--- lib/libc/stdlib/random.c1 Jun 2012 01:01:57 -   1.17
+++ lib/libc/stdlib/random.c27 Sep 2012 10:48:45 -
@@ -35,6 +35,10 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
+#include "thread_private.h"
+
+static void srandom_unlocked(unsigned int);
+static long random_unlocked(void);

 /*
  * random.c:
@@ -186,8 +190,8 @@ static int rand_sep = SEP_3;
  * introduced by the L.C.R.N.G.  Note that the initialization of randtbl[]
  * for default usage relies on values produced by this routine.
  */
-void
-srandom(unsigned int x)
+static void
+srandom_unlocked(unsigned int x)
 {
int i;
int32_t test;
@@ -213,10 +217,18 @@ srandom(unsigned int x)
fptr = &state[rand_sep];
rptr = &state[0];
for (i = 0; i < 10 * rand_deg; i++)
-   (void)random();
+   (void)random_unlocked();
}
 }

+void
+srandom(unsigned int x)
+{
+   _RANDOM_LOCK();
+   srandom_unlocked(x);
+   _RANDOM_UNLOCK();
+}
+
 /*
  * srandomdev:
  *
@@ -273,12 +285,15 @@ initstate(u_int seed, char *arg_state, s
 {
char *ostate = (char *)(&state[-1]);

+   _RANDOM_LOCK();
if (rand_type == TYPE_0)
state[-1] = rand_type;
else
state[-1] = MAX_TYPES * (rptr - state) + rand_type;
-   if (n < BREAK_0)
+   if (n < BREAK_0) {
+   

Re: Threads related SIGSEGV in random.c (diff, v2)

2012-09-27 Thread Alexey Suslikov
On Thursday, September 27, 2012, Philip Guenther wrote:

 On Thu, 27 Sep 2012, Alexey Suslikov wrote:
  Removing only local variables part reverts us to previous behavior (i.e.
  crashes).

 My guess is your program is calling srandom(), srandomdev(), initstate()
 or setstate() as well.  Your diff doesn't protect the alteration of state,
 end_ptr, fptr, and rptr on those paths, so a call to initstate() while
 another thread is in random() can walk fptr and/or rptr out of the state
 array.  Add the necessary locking in them and run your tests again.

 If not, well, crank up your debugging skills.  What was the line of code
 that actually triggered the crash?  Where did the bogus pointer come from?
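
A minimal sketch of the locking pattern being asked for, assuming a plain pthread mutex and a toy generator (the TYPE_0 linear congruential step) instead of the real random.c state. All demo_* names are hypothetical. The point is that every public entry point that touches the shared state takes the same lock, while internal callers use the _unlocked variants so the lock is never taken twice.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* One lock guards all shared generator state (hypothetical names). */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static uint32_t demo_state = 1;

/* Internal versions: caller must already hold demo_lock. */
static long
demo_next_unlocked(void)
{
	demo_state = (demo_state * 1103515245u + 12345u) & 0x7fffffffu;
	return (long)demo_state;
}

static void
demo_seed_unlocked(uint32_t x)
{
	demo_state = x;
}

/* Public versions: seeding and drawing both serialize on the same lock. */
void
demo_seed(uint32_t x)
{
	pthread_mutex_lock(&demo_lock);
	demo_seed_unlocked(x);
	pthread_mutex_unlock(&demo_lock);
}

long
demo_next(void)
{
	long r;

	pthread_mutex_lock(&demo_lock);
	r = demo_next_unlocked();
	pthread_mutex_unlock(&demo_lock);
	return r;
}
```

If only demo_next() took the lock, a concurrent demo_seed() could still change the state in the middle of a draw. In the real code that is the case where fptr and rptr end up outside the state array.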


Crash:

Program received signal SIGSEGV, Segmentation fault.
[Switching to thread 1006387]
0x0cb33345cf6e in random () at /usr/src/lib/libc/stdlib/random.c:387
387 *fptr += *rptr;

Back trace:

Thread 10 (thread 1003160):
#0  0x0cb33344135a in _thread_sys___thrsleep () at stdin:2
#1  0x0cb3315fac2a in pthread_cond_wait (condp=0xcb32a79c4b0,
mutexp=Variable mutexp is not available.
) at /usr/src/lib/librthread/rthread_sync.c:500
#2  0x0cb129f836ba in gwlist_consume () from /usr/local/sbin/bearerbox
#3  0x0cb129f121f1 in boxc_sender () from /usr/local/sbin/bearerbox
#4  0x0cb129f828dd in new_thread () from /usr/local/sbin/bearerbox
#5  0x0cb3315f911e in _rthread_start (v=Variable v is not available.
) at /usr/src/lib/librthread/rthread.c:122
#6  0x0cb333434f9b in __tfork_thread () at
/usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:75
Cannot access memory at address 0xcb32b27c000
0x0cb33345cf6e  387 *fptr += *rptr;



  I'm starting to believe that static globals are not good.

 They are incredibly good at what they do.  If you're trying to say that
 they fundamentally can't be thread-safe, you'll need some extraordinary
 evidence for such a claim.


What good do they do?

Cheers,
Alexey



Threads related SIGSEGV in random.c (diff, v2)

2012-09-27 Thread Alexey Suslikov
On Thursday, September 27, 2012, Alexey Suslikov wrote:

 On Thursday, September 27, 2012, Philip Guenther wrote:

 On Thu, 27 Sep 2012, Alexey Suslikov wrote:
  Removing only local variables part reverts us to previous behavior (i.e.
  crashes).

 My guess is your program is calling srandom(), srandomdev(), initstate()
 or setstate() as well.  Your diff doesn't protect the alteration of state,
 end_ptr, fptr, and rptr on those paths, so a call to initstate() while
 another thread is in random() can walk fptr and/or rptr out of the state
 array.  Add the necessary locking in them and run your tests again.

 If not, well, crank up your debugging skills.  What was the line of code
 that actually triggered the crash?  Where did the bogus pointer come from?


 Crash:

 Program received signal SIGSEGV, Segmentation fault.
 [Switching to thread 1006387]
 0x0cb33345cf6e in random () at /usr/src/lib/libc/stdlib/random.c:387
 387 *fptr += *rptr;

 Back trace:

 Thread 10 (thread 1003160):
 #0  0x0cb33344135a in _thread_sys___thrsleep () at stdin:2
 #1  0x0cb3315fac2a in pthread_cond_wait (condp=0xcb32a79c4b0,
 mutexp=Variable mutexp is not available.
 ) at /usr/src/lib/librthread/rthread_sync.c:500
 #2  0x0cb129f836ba in gwlist_consume () from /usr/local/sbin/bearerbox
 #3  0x0cb129f121f1 in boxc_sender () from /usr/local/sbin/bearerbox
 #4  0x0cb129f828dd in new_thread () from /usr/local/sbin/bearerbox
 #5  0x0cb3315f911e in _rthread_start (v=Variable v is not available.
 ) at /usr/src/lib/librthread/rthread.c:122
 #6  0x0cb333434f9b in __tfork_thread () at
 /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:75
 Cannot access memory at address 0xcb32b27c000
 0x0cb33345cf6e  387 *fptr += *rptr;



  I'm starting to believe that static globals are not good.

 They are incredibly good at what they do.  If you're trying to say that
 they fundamentally can't be thread-safe, you'll need some extraordinary
 evidence for such a claim.


 What good do they do?


Philip, can you help us write a threaded test case (spawning a number of
threads, each calling random)?



Re: Threads related SIGSEGV in random.c (diff, v2)

2012-09-26 Thread Alexey Suslikov
Hi.

Any news on that?

On Friday, September 21, 2012, Alexey Suslikov wrote:

 On Fri, Sep 21, 2012 at 10:36 AM, Alexey Suslikov
 alexey.susli...@gmail.com wrote:
  On Wed, Sep 19, 2012 at 10:24 PM, Ted Unangst t...@tedunangst.com wrote:
  On Wed, Sep 19, 2012 at 18:50, Alexey Suslikov wrote:
  On Wednesday, September 19, 2012, Theo de Raadt wrote:
 
   arc4random() is also thread-safe (it has internal locking) and very
   desirable for other reasons. But no way to save state.
 
  The last part of this is intentional.  Saving the state of pseudo
  random number generators is a stupid concept from the 80's.
 
 
  I see many rng functions behaving very differently. Is it a good idea
  to create a common locking layer on top of need-to-be-safe rng
  functions? Or we should deal only with original problem (and only
  port random.c code from netbsd)?
 
  just slap a mutex around it.
 
  With the diff below Kannel no longer crashes. Only protecting random()
  for now.
 
  Make random() thread-safe by surrounding real call with a mutex locking.
  Found by and diff from Roman Kravchuk. Mainly from NetBSD.

 Sorry. Here is correct diff.

 We are kinda unsure about the approach. For now, we follow the arc4random pattern.
 Should we use generic _thread_mutex_lock/_thread_mutex_unlock instead?

 Index: lib/libc/include/thread_private.h
 ===
 RCS file: /cvs/src/lib/libc/include/thread_private.h,v
 retrieving revision 1.25
 diff -u -p -r1.25 thread_private.h
 --- lib/libc/include/thread_private.h   16 Oct 2011 06:29:56 -
  1.25
 +++ lib/libc/include/thread_private.h   21 Sep 2012 07:59:34 -
 @@ -172,4 +172,16 @@ void   _thread_arc4_unlock(void);
 _thread_arc4_unlock();\
 } while (0)

 +void   _thread_random_lock(void);
 +void   _thread_random_unlock(void);
 +
 +#define _RANDOM_LOCK() do {\
 +   if (__isthreaded)   \
 +   _thread_random_lock();  \
 +   } while (0)
 +#define _RANDOM_UNLOCK()   do {\
 +   if (__isthreaded)   \
 +   _thread_random_unlock();\
 +   } while (0)
 +
  #endif /* _THREAD_PRIVATE_H_ */
 Index: lib/libc/stdlib/random.c
 ===
 RCS file: /cvs/src/lib/libc/stdlib/random.c,v
 retrieving revision 1.17
 diff -u -p -r1.17 random.c
 --- lib/libc/stdlib/random.c1 Jun 2012 01:01:57 -   1.17
 +++ lib/libc/stdlib/random.c21 Sep 2012 07:59:35 -
 @@ -35,6 +35,7 @@
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>
 +#include <thread_private.h>

  /*
   * random.c:
 @@ -376,21 +377,38 @@ setstate(char *arg_state)
   *
   * Returns a 31-bit random number.
   */
 -long
 -random(void)
 +static long
 +random_unlocked(void)
  {
 int32_t i;
 +   int32_t *f, *r;

 if (rand_type == TYPE_0)
 i = state[0] = (state[0] * 1103515245 + 12345) & 0x7fffffff;
 else {
 -   *fptr += *rptr;
 -   i = (*fptr >> 1) & 0x7fffffff;  /* chucking least random bit */
 -   if (++fptr >= end_ptr) {
 -   fptr = state;
 -   ++rptr;
 -   } else if (++rptr >= end_ptr)
 -   rptr = state;
 +   /*
 +* Use local variables rather than static variables for speed.
 +*/
 +   f = fptr; r = rptr;
 +   *f += *r;
 +   i = (*f >> 1) & 0x7fffffff; /* chucking least random bit */
 +   if (++f >= end_ptr) {
 +   f = state;
 +   ++r;
 +   } else if (++r >= end_ptr)
 +   r = state;
 +   fptr = f; rptr = r;
 }
 return((long)i);
 +}
 +
 +long
 +random(void)
 +{
 +   long r;
 +
 +   _RANDOM_LOCK();
 +   r = random_unlocked();
 +   _RANDOM_UNLOCK();
 +   return (r);
  }
 Index: lib/libc/thread/unithread_malloc_lock.c
 ===
 RCS file: /cvs/src/lib/libc/thread/unithread_malloc_lock.c,v
 retrieving revision 1.8
 diff -u -p -r1.8 unithread_malloc_lock.c
 --- lib/libc/thread/unithread_malloc_lock.c 13 Jun 2008 21:18:43 -  1.8
 +++ lib/libc/thread/unithread_malloc_lock.c 21 Sep 2012 07:59:35 -
 @@ -21,6 +21,12 @@ WEAK_PROTOTYPE(_thread_arc4_unlock);
  WEAK_ALIAS(_thread_arc4_lock);
  WEAK_ALIAS(_thread_arc4_unlock);

 +WEAK_PROTOTYPE(_thread_random_lock);
 +WEAK_PROTOTYPE(_thread_random_unlock);
 +
 +WEAK_ALIAS(_thread_random_lock);
 +WEAK_ALIAS

Re: Threads related SIGSEGV in random.c (diff, v2)

2012-09-26 Thread Alexey Suslikov
On Wed, Sep 26, 2012 at 9:51 PM, Ted Unangst <t...@tedunangst.com> wrote:
 On Wed, Sep 26, 2012 at 11:18, Alexey Suslikov wrote:
 Hi.

 Any news on that?

 Can we do it without the local variables for speed part?  I am not
 interested in making this function faster.


Removing only the local-variables part brings back the previous
behavior (i.e. crashes).

However, leaving the current code as is and adding only the local
variables (see below) passes our test with no crashes.

I'm starting to believe that the static globals are the problem.

Can somebody help us write a threaded test case?

As mentioned above, we use the Kannel port as our test, which
is somewhat hard to share.
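
For reference, a minimal sketch of what such a threaded test case could
look like, assuming plain POSIX threads. The names stress_random(),
hammer() and MAX_THREADS are made up for this sketch and are not part of
libc or Kannel. It only starts several threads that call random() in a
tight loop and counts results outside the documented 31-bit range; on a
libc where random() races, the expected symptom is a crash rather than a
non-zero count.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_THREADS 64		/* hypothetical upper bound for this sketch */

struct worker {
	long iterations;	/* calls to random() this thread makes */
	long bad;		/* results outside [0, 2^31 - 1] */
};

/* Thread body: call random() repeatedly and count out-of-range values. */
static void *
hammer(void *arg)
{
	struct worker *w = arg;
	long i, v;

	for (i = 0; i < w->iterations; i++) {
		v = random();
		if (v < 0 || v > 0x7fffffffL)
			w->bad++;
	}
	return (NULL);
}

/*
 * Run nthreads threads that call random() concurrently.
 * Returns the number of out-of-range results, or -1 if a thread
 * could not be started.
 */
long
stress_random(int nthreads, long iterations)
{
	pthread_t tid[MAX_THREADS];
	struct worker w[MAX_THREADS];
	long bad = 0;
	int i, started = 0;

	assert(nthreads > 0 && nthreads <= MAX_THREADS);
	srandom(1);
	for (i = 0; i < nthreads; i++) {
		w[i].iterations = iterations;
		w[i].bad = 0;
		if (pthread_create(&tid[i], NULL, hammer, &w[i]) != 0)
			break;
		started++;
	}
	for (i = 0; i < started; i++) {
		pthread_join(tid[i], NULL);
		bad += w[i].bad;
	}
	return (started == nthreads ? bad : -1);
}
```

Calling stress_random(8, 10000000) from main() and linking with -lpthread
should be enough to exercise the race. A clean run proves nothing, since
the failure depends on timing.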

Alexey

Index: lib/libc/stdlib/random.c
===
RCS file: /cvs/src/lib/libc/stdlib/random.c,v
retrieving revision 1.17
diff -u -p -r1.17 random.c
--- lib/libc/stdlib/random.c1 Jun 2012 01:01:57 -   1.17
+++ lib/libc/stdlib/random.c26 Sep 2012 20:30:46 -
@@ -380,17 +380,20 @@ long
 random(void)
 {
int32_t i;
+   int32_t *f, *r;

if (rand_type == TYPE_0)
i = state[0] = (state[0] * 1103515245 + 12345) & 0x7fffffff;
else {
-   *fptr += *rptr;
-   i = (*fptr >> 1) & 0x7fffffff;  /* chucking least random bit */
-   if (++fptr >= end_ptr) {
-   fptr = state;
-   ++rptr;
-   } else if (++rptr >= end_ptr)
-   rptr = state;
+   f = fptr; r = rptr;
+   *f += *r;
+   i = (*f >> 1) & 0x7fffffff; /* chucking least random bit */
+   if (++f >= end_ptr) {
+   f = state;
+   ++r;
+   } else if (++r >= end_ptr)
+   r = state;
+   fptr = f; rptr = r;
}
return((long)i);
 }
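
To make the pointer handling in this diff easier to follow, here is a
self-contained toy version of the additive feedback step with local
pointer copies. It is only a sketch: the toy_* names, the state size and
the seeding are assumptions for illustration and do not match what libc
does. One plausible reading of the test results, not confirmed anywhere in
this thread, is that with local copies the shared pointers are only ever
stored after the wrap-around check, so another thread never sees a value
at or past the end of the state array. The data race itself is still
there, so this would narrow the window without fixing anything.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_DEG 31	/* words of state (assumed, for illustration) */
#define TOY_SEP 3	/* distance between front and rear pointers */

static uint32_t toy_state[TOY_DEG];
static uint32_t *toy_fptr = &toy_state[TOY_SEP];
static uint32_t *toy_rptr = &toy_state[0];
static uint32_t * const toy_end = &toy_state[TOY_DEG];

/* Fill the state with a simple LCG. This is NOT the seeding libc uses. */
void
toy_seed(uint32_t seed)
{
	int i;

	toy_state[0] = seed;
	for (i = 1; i < TOY_DEG; i++)
		toy_state[i] = toy_state[i - 1] * 1103515245u + 12345u;
	toy_fptr = &toy_state[TOY_SEP];
	toy_rptr = &toy_state[0];
}

/* One generator step, working on local copies of the shared pointers. */
long
toy_random(void)
{
	uint32_t *f = toy_fptr, *r = toy_rptr;
	uint32_t i;

	/* Invariant the unlocked code depends on: both pointers in range. */
	assert(f >= toy_state && f < toy_end);
	assert(r >= toy_state && r < toy_end);

	*f += *r;
	i = (*f >> 1) & 0x7fffffff;	/* chucking least random bit */
	if (++f >= toy_end) {
		f = toy_state;
		++r;
	} else if (++r >= toy_end)
		r = toy_state;

	/* Publish only values that have already been wrapped. */
	toy_fptr = f;
	toy_rptr = r;
	return ((long)i);
}
```

The state is kept in uint32_t here so that the addition wraps without
signed overflow; the diff above works on int32_t.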



Re: Threads related SIGSEGV in random.c

2012-09-21 Thread Alexey Suslikov
On Wed, Sep 19, 2012 at 10:24 PM, Ted Unangst <t...@tedunangst.com> wrote:
 On Wed, Sep 19, 2012 at 18:50, Alexey Suslikov wrote:
 On Wednesday, September 19, 2012, Theo de Raadt wrote:

  arc4random() is also thread-safe (it has internal locking) and very
  desirable for other reasons. But no way to save state.

 The last part of this is intentional.  Saving the state of pseudo
 random number generators is a stupid concept from the 80's.


 I see many rng functions behaving very differently. Is it a good idea
 to create a common locking layer on top of need-to-be-safe rng
 functions? Or we should deal only with original problem (and only
 port random.c code from netbsd)?

 just slap a mutex around it.

With the diff below, Kannel no longer crashes. It only protects random()
for now.

Make random() thread-safe by wrapping the real call in a mutex lock.
Found by, and diff from, Roman Kravchuk. Mainly from NetBSD.

Index: include/thread_private.h
===
RCS file: /cvs/src/lib/libc/include/thread_private.h,v
retrieving revision 1.25
diff -u -p -r1.25 thread_private.h
--- include/thread_private.h16 Oct 2011 06:29:56 -  1.25
+++ include/thread_private.h20 Sep 2012 22:10:49 -
@@ -172,4 +172,16 @@ void   _thread_arc4_unlock(void);
_thread_arc4_unlock();\
} while (0)

+void   _thread_random_lock(void);
+void   _thread_random_unlock(void);
+
+#define _RANDOM_LOCK() do {\
+   if (__isthreaded)   \
+   _thread_random_lock();  \
+   } while (0)
+#define _RANDOM_UNLOCK()   do {\
+   if (__isthreaded)   \
+   _thread_random_unlock();\
+   } while (0)
+
 #endif /* _THREAD_PRIVATE_H_ */
Index: stdlib/random.c
===
RCS file: /cvs/src/lib/libc/stdlib/random.c,v
retrieving revision 1.17
diff -u -p -r1.17 random.c
--- stdlib/random.c 1 Jun 2012 01:01:57 -   1.17
+++ stdlib/random.c 20 Sep 2012 22:10:50 -
@@ -35,6 +35,7 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
+#include <thread_private.h>

 /*
  * random.c:
@@ -376,21 +377,38 @@ setstate(char *arg_state)
  *
  * Returns a 31-bit random number.
  */
-long
-random(void)
+static long
+random_unlocked(void)
 {
int32_t i;
+   int32_t *f, *r;

if (rand_type == TYPE_0)
i = state[0] = (state[0] * 1103515245 + 12345) & 0x7fffffff;
else {
-   *fptr += *rptr;
-   i = (*fptr >> 1) & 0x7fffffff;  /* chucking least random bit */
-   if (++fptr >= end_ptr) {
-   fptr = state;
-   ++rptr;
-   } else if (++rptr >= end_ptr)
-   rptr = state;
+   /*
+* Use local variables rather than static variables for speed.
+*/
+   f = fptr; r = rptr;
+   *f += *r;
+   i = (*f >> 1) & 0x7fffffff; /* chucking least random bit */
+   if (++f >= end_ptr) {
+   f = state;
+   ++r;
+   } else if (++r >= end_ptr)
+   r = state;
+   fptr = f; rptr = r;
}
return((long)i);
+}
+
+long
+random(void)
+{
+   long r;
+
+   _RANDOM_LOCK();
+   r = random_unlocked();
+   _RANDOM_UNLOCK();
+   return (r);
 }
Index: thread/unithread_malloc_lock.c
===
RCS file: /cvs/src/lib/libc/thread/unithread_malloc_lock.c,v
retrieving revision 1.8
diff -u -p -r1.8 unithread_malloc_lock.c
--- thread/unithread_malloc_lock.c  13 Jun 2008 21:18:43 -  1.8
+++ thread/unithread_malloc_lock.c  20 Sep 2012 22:10:50 -
@@ -21,6 +21,12 @@ WEAK_PROTOTYPE(_thread_arc4_unlock);
 WEAK_ALIAS(_thread_arc4_lock);
 WEAK_ALIAS(_thread_arc4_unlock);

+WEAK_PROTOTYPE(_thread_random_lock);
+WEAK_PROTOTYPE(_thread_random_unlock);
+
+WEAK_ALIAS(_thread_random_lock);
+WEAK_ALIAS(_thread_random_unlock);
+
 void
 WEAK_NAME(_thread_malloc_lock)(void)
 {
@@ -53,6 +59,18 @@ WEAK_NAME(_thread_arc4_lock)(void)

 void
 WEAK_NAME(_thread_arc4_unlock)(void)
+{
+   return;
+}
+
+void
+WEAK_NAME(_thread_random_lock)(void)
+{
+   return;
+}
+
+void
+WEAK_NAME(_thread_random_unlock)(void)
 {
return;
 }
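
For anyone who cannot rebuild libc, the same "slap a mutex around it" idea
can be sketched at application level. This is only an illustration with a
made-up name, locked_random(), and an ordinary pthread mutex; it is not the
libc-internal _thread_random_lock() mechanism from the diff above, and it
helps only if every caller in the program goes through the wrapper.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* One process-wide lock serializing all calls made through the wrapper. */
static pthread_mutex_t random_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical wrapper: same result as random(), but serialized. */
long
locked_random(void)
{
	long r;

	pthread_mutex_lock(&random_mtx);
	r = random();
	pthread_mutex_unlock(&random_mtx);
	return (r);
}
```

srandom(), initstate() and setstate() touch the same state, so they would
need to take the same lock if they can run concurrently with the wrapper.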



Threads related SIGSEGV in random.c (diff, v2)

2012-09-21 Thread Alexey Suslikov
On Fri, Sep 21, 2012 at 10:36 AM, Alexey Suslikov
<alexey.susli...@gmail.com> wrote:
 On Wed, Sep 19, 2012 at 10:24 PM, Ted Unangst <t...@tedunangst.com> wrote:
 On Wed, Sep 19, 2012 at 18:50, Alexey Suslikov wrote:
 On Wednesday, September 19, 2012, Theo de Raadt wrote:

  arc4random() is also thread-safe (it has internal locking) and very
  desirable for other reasons. But no way to save state.

 The last part of this is intentional.  Saving the state of pseudo
 random number generators is a stupid concept from the 80's.


 I see many rng functions behaving very differently. Is it a good idea
 to create a common locking layer on top of need-to-be-safe rng
 functions? Or we should deal only with original problem (and only
 port random.c code from netbsd)?

 just slap a mutex around it.

 With the diff below Kannel no longer crashes. Only protecting random()
 for now.

 Make random() thread-safe by surrounding real call with a mutex locking.
 Found by and diff from Roman Kravchuk. Mainly from NetBSD.

Sorry. Here is the correct diff.

We are kind of unsure about the approach. For now, we follow the arc4random
pattern. Should we use the generic _thread_mutex_lock/_thread_mutex_unlock
instead?

Index: lib/libc/include/thread_private.h
===
RCS file: /cvs/src/lib/libc/include/thread_private.h,v
retrieving revision 1.25
diff -u -p -r1.25 thread_private.h
--- lib/libc/include/thread_private.h   16 Oct 2011 06:29:56 -  1.25
+++ lib/libc/include/thread_private.h   21 Sep 2012 07:59:34 -
@@ -172,4 +172,16 @@ void   _thread_arc4_unlock(void);
_thread_arc4_unlock();\
} while (0)

+void   _thread_random_lock(void);
+void   _thread_random_unlock(void);
+
+#define _RANDOM_LOCK() do {\
+   if (__isthreaded)   \
+   _thread_random_lock();  \
+   } while (0)
+#define _RANDOM_UNLOCK()   do {\
+   if (__isthreaded)   \
+   _thread_random_unlock();\
+   } while (0)
+
 #endif /* _THREAD_PRIVATE_H_ */
Index: lib/libc/stdlib/random.c
===
RCS file: /cvs/src/lib/libc/stdlib/random.c,v
retrieving revision 1.17
diff -u -p -r1.17 random.c
--- lib/libc/stdlib/random.c1 Jun 2012 01:01:57 -   1.17
+++ lib/libc/stdlib/random.c21 Sep 2012 07:59:35 -
@@ -35,6 +35,7 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
+#include <thread_private.h>

 /*
  * random.c:
@@ -376,21 +377,38 @@ setstate(char *arg_state)
  *
  * Returns a 31-bit random number.
  */
-long
-random(void)
+static long
+random_unlocked(void)
 {
int32_t i;
+   int32_t *f, *r;

if (rand_type == TYPE_0)
i = state[0] = (state[0] * 1103515245 + 12345) & 0x7fffffff;
else {
-   *fptr += *rptr;
-   i = (*fptr >> 1) & 0x7fffffff;  /* chucking least random bit */
-   if (++fptr >= end_ptr) {
-   fptr = state;
-   ++rptr;
-   } else if (++rptr >= end_ptr)
-   rptr = state;
+   /*
+* Use local variables rather than static variables for speed.
+*/
+   f = fptr; r = rptr;
+   *f += *r;
+   i = (*f >> 1) & 0x7fffffff; /* chucking least random bit */
+   if (++f >= end_ptr) {
+   f = state;
+   ++r;
+   } else if (++r >= end_ptr)
+   r = state;
+   fptr = f; rptr = r;
}
return((long)i);
+}
+
+long
+random(void)
+{
+   long r;
+
+   _RANDOM_LOCK();
+   r = random_unlocked();
+   _RANDOM_UNLOCK();
+   return (r);
 }
Index: lib/libc/thread/unithread_malloc_lock.c
===
RCS file: /cvs/src/lib/libc/thread/unithread_malloc_lock.c,v
retrieving revision 1.8
diff -u -p -r1.8 unithread_malloc_lock.c
--- lib/libc/thread/unithread_malloc_lock.c 13 Jun 2008 21:18:43 -  1.8
+++ lib/libc/thread/unithread_malloc_lock.c 21 Sep 2012 07:59:35 -
@@ -21,6 +21,12 @@ WEAK_PROTOTYPE(_thread_arc4_unlock);
 WEAK_ALIAS(_thread_arc4_lock);
 WEAK_ALIAS(_thread_arc4_unlock);

+WEAK_PROTOTYPE(_thread_random_lock);
+WEAK_PROTOTYPE(_thread_random_unlock);
+
+WEAK_ALIAS(_thread_random_lock);
+WEAK_ALIAS(_thread_random_unlock);
+
 void
 WEAK_NAME(_thread_malloc_lock)(void)
 {
@@ -53,6 +59,18 @@ WEAK_NAME(_thread_arc4_lock)(void)

 void
 WEAK_NAME(_thread_arc4_unlock)(void)
+{
+   return;
+}
+
+void
+WEAK_NAME(_thread_random_lock)(void

Re: Threads related SIGSEGV in random.c

2012-09-19 Thread Alexey Suslikov
On Wednesday, September 19, 2012, Theo de Raadt wrote:

  arc4random() is also thread-safe (it has internal locking) and very
  desirable for other reasons. But no way to save state.

 The last part of this is intentional.  Saving the state of pseudo
 random number generators is a stupid concept from the 80's.


I see many rng functions behaving very differently. Is it a good idea
to create a common locking layer on top of the rng functions that need
to be safe? Or should we deal only with the original problem (and only
port the random.c code from NetBSD)?



Re: Threads related SIGSEGV in random.c

2012-09-19 Thread Alexey Suslikov
On Wed, Sep 19, 2012 at 10:24 PM, Ted Unangst <t...@tedunangst.com> wrote:
 On Wed, Sep 19, 2012 at 18:50, Alexey Suslikov wrote:
 On Wednesday, September 19, 2012, Theo de Raadt wrote:

  arc4random() is also thread-safe (it has internal locking) and very
  desirable for other reasons. But no way to save state.

 The last part of this is intentional.  Saving the state of pseudo
 random number generators is a stupid concept from the 80's.


 I see many rng functions behaving very differently. Is it a good idea
 to create a common locking layer on top of need-to-be-safe rng
 functions? Or we should deal only with original problem (and only
 port random.c code from netbsd)?

 just slap a mutex around it.

Could you tell me how to rebuild and reinstall libc properly?


