from:"Julian Elischer"

Re: Light GeoIP support dropped?

2020-01-06 Thread Julian Elischer


On 1/6/20 6:04 PM, Kevin Oberman wrote:

On Mon, Jan 6, 2020 at 1:17 PM Alexander Koeppe  wrote:


Hi,

since I've upgraded to FreeBSD 12, I don't find a package providing the
lightweight geoip database API incl. GeoIP.h and libGeoIP.so.

I only find `geoipupdate` which is the non-free variant of the API.

Has the package been renamed?

Thanks

- Alex


GeoIP and the GeoIP 1 database were discontinued early last year. They were
replaced by net/libmaxminddb and GeoIP 2 database. I have no idea if any
form of free data is available.


there is a partial alternative in ports...

https://www.freshports.org/search.php?query=ipdbtools=go=10=name=match=excludedeleted=1=caseinsensitive

It uses the official national registrations for country enumeration, 
and can generate firewall tables directly.


Here's the cron script I use to generate a table in ipfw that only 
allows Australian and US addresses (for example):



#!/bin/sh
# Rebuild the ipfw table of allowed AU and US address ranges.
ALLOWFILE=/root/AU+USA-GEOIPS.ipfw
MAILTABLE=20
ALT_MAILTABLE=21
AU_VAL=1
US_VAL=10200

# Fetch the latest geo-ip ranges, load AU and USA into the alternate
# table ${ALT_MAILTABLE}, then swap it with table ${MAILTABLE}.
ipdb-update.sh
ipup -t AU=${AU_VAL}:US=${US_VAL} -n ${ALT_MAILTABLE} > ${ALLOWFILE}
ipfw table ${ALT_MAILTABLE} flush
ipfw -q -f ${ALLOWFILE}
ipfw table ${MAILTABLE} swap ${ALT_MAILTABLE}


___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"




Re: UEFI ISO boot not working in 12.1 ?

2019-11-06 Thread Julian Elischer


On 11/6/19 4:04 PM, George Michaelson wrote:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=239876 is relevant maybe?


I suspect a separate bug, because the OP specified that it worked in 
12.0, whereas those bugs go back to 9.x.


Julian.



On Thu, Nov 7, 2019 at 9:46 AM Julian Elischer  wrote:

On 11/6/19 2:53 PM, Warner Losh wrote:

On Wed, Nov 6, 2019 at 2:03 PM Chris Ross  wrote:


On Wed, Nov 06, 2019 at 02:17:11PM -0500, Chris Ross wrote:

Hi there.  I tried booting FreeBSD-12.1-RELEASE-amd64-disc1.iso on a

[...]

I need to do?  How has 12.1 changed w.r.t. 12.0 for UEFI?

More information.  A stable/12 ISO that I built fails in the same way the
12.1-RELEASE ISO did.  But, I just grabbed releng/12.0, and built a release
ISO, and it boots.  So, something seems definitely to have changed in the
way the UEFI bits are on the boot ISOs?  Or maybe a change in the loader?
Is okay in releng/12.0, but broken in 12.1-RELEASE and stable/12.

Let me know what to try next.


You could try some bisection back along the 12 branch.



Re: UEFI ISO boot not working in 12.1 ?

2019-11-06 Thread Julian Elischer


On 11/6/19 2:53 PM, Warner Losh wrote:

On Wed, Nov 6, 2019 at 2:03 PM Chris Ross  wrote:


On Wed, Nov 06, 2019 at 02:17:11PM -0500, Chris Ross wrote:

Hi there.  I tried booting FreeBSD-12.1-RELEASE-amd64-disc1.iso on a

[...]

I need to do?  How has 12.1 changed w.r.t. 12.0 for UEFI?

More information.  A stable/12 ISO that I built fails in the same way the
12.1-RELEASE ISO did.  But, I just grabbed releng/12.0, and built a release
ISO, and it boots.  So, something seems definitely to have changed in the
way the UEFI bits are on the boot ISOs?  Or maybe a change in the loader?
Is okay in releng/12.0, but broken in 12.1-RELEASE and stable/12.

Let me know what to try next.


You could try some bisection back along the 12 branch.
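
Bisection here amounts to a binary search over revisions between a known-good and a known-bad build. The following is only a sketch under assumptions: is_bad_fn, first_bad_revision and fake_is_bad are hypothetical names, the real test ("build an ISO at that revision and try to boot it") is replaced by a stand-in predicate, and it assumes every revision after the breaking commit also fails.

```c
#include <assert.h>

/* Hypothetical callback: nonzero if an ISO built from `rev` fails to boot. */
typedef int (*is_bad_fn)(long rev, void *ctx);

/*
 * Return the first bad revision in (good, bad], assuming `good` boots,
 * `bad` does not, and all revisions after the breaking commit are bad.
 */
long
first_bad_revision(long good, long bad, is_bad_fn is_bad, void *ctx)
{
	assert(good < bad);
	while (bad - good > 1) {
		long mid = good + (bad - good) / 2;

		if (is_bad(mid, ctx))
			bad = mid;	/* breakage is at or before mid */
		else
			good = mid;	/* breakage is after mid */
	}
	return (bad);
}

/* Stand-in for a real build-and-boot test: ctx points at the breaking revision. */
int
fake_is_bad(long rev, void *ctx)
{
	return (rev >= *(long *)ctx);
}
```

Each step halves the range, so a span of a few thousand commits needs roughly a dozen build-and-boot cycles.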



Re: Random system lockups with 12.1-STABLE r354241 amd64

2019-11-06 Thread Julian Elischer


On 11/6/19 2:58 PM, James Wright wrote:

Hi,

[...]

  Can anyone offer some advice as to how I can track down this issue?

The first question, which I couldn't answer from your dmesg, is: "do you 
have the kernel debugger configured into your kernel?"



Re: Spectre/Meltdown mitigation in 11.1-p10 bogging down zfs send/receive?

2018-05-14 Thread Julian Elischer


On 14/5/18 11:48 pm, Patrick M. Hausen wrote:

Hi!


On 14.05.2018 at 17:35, Patrick M. Hausen wrote:
Possibly we are on the wrong track altogether.

We were - please just forget it ...


Isn't it a fact that you will always discover your own problem 
immediately after posting for help from the entire world?

I'm sure it is a corollary to Murphy's law.


ZFS scrub running during our activity ... everybody who already put
more than five minutes of thought into this deserves a beer at the next
EuroBSDCon ;-)

Patrick




Re: DDD hangs on start on 11.1-R

2018-05-03 Thread Julian Elischer


On 7/3/18 3:19 am, John Baldwin wrote:

On Monday, March 05, 2018 08:19:24 AM Daniel Eischen wrote:

On Mon, 5 Mar 2018, Trond Endrestøl wrote:


On Sat, 3 Mar 2018 18:09+0100, Holm Tiffe wrote:


can anyone get ddd to work in 11.1-R or stable?

I've more or less given up on devel/ddd, since it relies on the old
pty subsystem, now replaced by the new pts subsystem, to communicate
with gdb.

I build custom kernels containing "device pty", but I'm not sure if
that directive is being honoured these days.

It's a shame, 'cos ddd is very good at visualizing data structures.
Maybe it's possible to patch ddd to use pts instead of pty.

I used to like ddd also.  You might try devel/gps.  It's more
than just a debugger, but you can use it just for debugging.
Note, it's been a while since I've used it, but worked similarly
to ddd.

I patched ddd to use pts (was a short patch) but it still hangs for me
with both old and new gdb.  I think it is unfortunately abandonware. :(


What a pity. It was really nice to use (if you had a big screen).



Re: ZFS: Can't find pool by guid

2018-04-29 Thread Julian Elischer


On 28/4/18 8:46 pm, Willem Jan Withagen wrote:

Hi,

I upgraded a server from 10.4 to 11.1 and now all of a sudden the 
server complains about:

ZFS: Can't find pool by guid
And I end up in the boot prompt:

lsdev gives disk0, with p1 being the partition that the zroot is/was on.

This is an active server, so redoing the install and so on is not going 
to be really workable.


So how do I get this to boot?

Thanx,
--WjW



Did you try

  zpool import -d /dev

(when booted from a memstick)?


Re: kern.sched.quantum: Creepy, sadistic scheduler

2018-04-08 Thread Julian Elischer


On 7/4/18 10:21 pm, Peter wrote:

Julian Elischer wrote:
For a single CPU you really should compile a kernel with SMP turned off
and the 4BSD scheduler.

ULE is just trying too hard to do stuff you don't need.


Julian,

if we agree on this, I am fine.
(This implies that SCHED_4BSD will *not* be retired for an 
indefinite time!)


There is no reason to retire it.
We implemented a scheduler interface that both schedulers stick to.



I tested yesterday, and SCHED_4BSD doesn't show the annoying behaviour.
SMP seems to be no problem (and I need that), but PREEMPTION is 
definitely related to the problem (see my other message sent now).


P.

Re: kern.sched.quantum: Creepy, sadistic scheduler

2018-04-07 Thread Julian Elischer


On 4/4/18 9:32 pm, George Mitchell wrote:

On 04/04/18 06:39, Alban Hertroys wrote:

[...]
That said, SCHED_ULE (the default scheduler for quite a while now) was designed 
with multi-CPU configurations in mind and there are claims that SCHED_4BSD 
works better for single-CPU configurations. You may give that a try, if you're 
not already on SCHED_4BSD.
[...]

A small, disgruntled community of FreeBSD users who have never seen
proof that SCHED_ULE is better than SCHED_4BSD in any environment
continue to regularly recompile with SCHED_4BSD.  I dread the day when
that becomes impossible, but at least it isn't here yet.  -- George

For a single CPU you really should compile a kernel with SMP turned 
off and the 4BSD scheduler.


ULE is just trying too hard to do stuff you don't need.



Re: 11.1 running on HyperV hn interface hangs

2017-09-06 Thread Julian Elischer


On 6/9/17 7:02 pm, Pete French wrote:
We recently moved our software from 11.0-p9 to 11.1-p1, but it looks like
there is a regression in 11.1-p1 running on HyperV (Windows/HyperV 2012 R2)
where the virtual hn0 interface hangs with the following kernel messages:

  hn0:  on vmbus0
  hn0: Ethernet address: 00:15:5d:31:21:0f
  hn0: link state changed to UP
  ...
  hn0: RXBUF ack retry
  hn0: RXBUF ack failed
  last message repeated 571 times

It requires a restart of the HyperV VM.

This is a customer production server (remote customer ~4000km away) running
fairly critical monitoring software, so we needed to roll it back to 11.0-p9.
We only have two customers running our software in HyperV, vs lots in VMware
and a handful on physical hardware.

11.0-p9 has been very stable.  Has anyone seen this problem before with 11.1?



I don't run anything on local Hyper-V anymore, but I do run a lot of 
stuff in Azure, and we haven't seen anything like this. I track 
STABLE for things though, updating after reading the commits and 
testing locally for a week or so, so the version I am running 
currently is r320175, which was part of 11.1-BETA2. I am going to 
upgrade to a more recent STABLE sometime this week or next though; I 
will do that on a test machine and let you know how it goes.


I seem to recall that there were some large changes to the hn code 
in August to add virtual function support. When does 11.1-p1 date 
from?
Make sure you contact the FreeBSD/Microsoft guys. They are very responsive, 
but I don't know if they watch -stable.

I'll cc a couple..



-pete.

Re: FreeBSD 10.4-BETA1 Now Available [ AUSTRALIAN Mirror ]

2017-08-21 Thread Julian Elischer


On 22/8/17 2:54 am, Julian Elischer wrote:

On 22/8/17 2:13 am, Ian Smith wrote:

Speaking of mirrors, just for Antipodeans:

  > https://download.freebsd.org/ftp/releases/ISO-IMAGES/10.4/

is very slow (~100KB/s) from here tonight - must be your eclipse? - but


ftp://ftp3.au.freebsd.org/pub/FreeBSD/releases/ISO-IMAGES/10.4/

delivered both a dvd1.iso and memstick.img at >1MB/s on ADSL.  ftp.au
was refusing connections, and ftp2.au has only up to 10.3 currently.

cheers, Ian



or if you are on iinet/internode (TPG?)

there is http://ftp.iinet.net.au/pub/FreeBSD/releases/ISO-IMAGES/10.4/

with the advantage of being unmetered if you are on iinet/internode

but you can still get it even from outside.

we should maybe ask them if we can add them to the mirrors list


oh also:
http://mirror.internode.on.net/pub/FreeBSD/
both internode and iinet mirrors are externally fetchable as a general 
service and are up to date and fast.







Re: FreeBSD 10.4-BETA1 Now Available [ AUSTRALIAN Mirror ]

2017-08-21 Thread Julian Elischer


On 22/8/17 2:13 am, Ian Smith wrote:

Speaking of mirrors, just for Antipodeans:

  > https://download.freebsd.org/ftp/releases/ISO-IMAGES/10.4/

is very slow (~100KB/s) from here tonight - must be your eclipse? - but

  ftp://ftp3.au.freebsd.org/pub/FreeBSD/releases/ISO-IMAGES/10.4/

delivered both a dvd1.iso and memstick.img at >1MB/s on ADSL.  ftp.au
was refusing connections, and ftp2.au has only up to 10.3 currently.

cheers, Ian


or if you are on iinet/internode (TPG?)

there is http://ftp.iinet.net.au/pub/FreeBSD/releases/ISO-IMAGES/10.4/

with the advantage of being unmetered if you are on iinet/internode

but you can still get it even from outside.

we should maybe ask them if we can add them to the mirrors list




Re: [ports] r438901 causes PACKAGES= issues

2017-05-22 Thread Julian Elischer


On 22/5/17 3:04 pm, Harry Schmalzbauer wrote:

  Regarding Harry Schmalzbauer's message of 21.05.2017 20:25 (localtime):

  Mk still tells:
# PACKAGES  - A top level directory where all packages go (rather than
#             going locally to each port).
#             Default: ${PORTSDIR}/packages

Since r438901 (
https://svnweb.freebsd.org/ports?view=revision=date=438901
)

Actually, r438058 broke PACKAGES. For the record, see
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218827



Has this been fixed? We use this feature but are not on the head 
of the tree yet.

I'm not looking forward to moving up and having everything break.


Re: GCC + FreeBSD 11.0 Stable - stat.h does not have vm_ooffset_t definition

2017-05-01 Thread Julian Elischer


On 1/5/17 5:53 pm, Gerald Pfeifer wrote:

On Mon, 1 May 2017, Mark Millard wrote:

and that mkheaders does more than just fixinc.sh
as far as changing headers goes, such as limits.h
and gsyslimits.h and syslimits.h .

That's a good point, and I guess the *limits.h files do make
sense to come from the compiler itself?


The fixincludes script is known to occasionally erroneously attempt
to "fix" the system headers installed so far. As the headers up to
this point are known to not require fixing, issue the following
command to prevent the fixincludes script from running:

sed -i 's@\./fixinc\.sh@-c true@' gcc/Makefile.in

(End quote)

:

This still leaves the limits.h and gsyslimits.h and
syslimits.h code in place but does block most of the
activity.

Thanks for this pointer, Mark!  I have earmarked this as the first
approach to give a try soon, instead of completely yanking the
fixincluded directory.

Since you know the output of the change, can you not make the execution 
dependent on some marker in the files that you can test for?


Gerald

Re: mountroot failing on zpools in Azure after upgrade from 10 to 11 due to lack of waiting for da0

2017-04-09 Thread Julian Elischer


On 8/4/17 7:01 pm, Edward Tomasz Napierała wrote:

On 0313T1206, Pete French wrote:

I have a number of machines in Azure, all booting from ZFS and, until
the weekend, running 10.3 perfectly happily.

I started upgrading these to 11. The first went fine, the second would
not boot. Looking at the boot diagnostics, it is having problems finding the
root pool to mount. I see this in the diagnostic output:

storvsc0:  on vmbus0
Solaris: NOTICE: Cannot find the pool label for 'rpool'
Mounting from zfs:rpool/ROOT/default failed with error 5.
Root mount waiting for: storvsc
(probe0:blkvsc0:0:storvsc1: 0:0):  on 
vmbus0
storvsc scsi_status = 2
(da0:blkvsc0:0:0:0): UNMAPPED
(probe1:blkvsc1:0:1:0): storvsc scsi_status = 2
hvheartbeat0:  on vmbus0
da0 at blkvsc0 bus 0 scbus2 target 0 lun 0

As you can see, the drive da0 only appears after it has tried, and failed,
to mount the root pool.

Does the same problem still happen with recent 11-STABLE?


There is a fix for this floating around, which we applied at work.
Our systems are 10.3, but I think it wouldn't be a bad thing to add
generally, as it could (if we let it) solve the problem we sometimes see
with NFS as well as with Azure.

p4 diff2 -du //depot/bugatti/FreeBSD-PZ/10.3/sys/kern/vfs_mountroot.c#1 //depot/bugatti/FreeBSD-PZ/10.3/sys/kern/vfs_mountroot.c#3
//depot/bugatti/FreeBSD-PZ/10.3/sys/kern/vfs_mountroot.c#1 (text) - //depot/bugatti/FreeBSD-PZ/10.3/sys/kern/vfs_mountroot.c#3 (text) content

@@ -126,8 +126,8 @@
 static int root_mount_mddev;
 static int root_mount_complete;

-/* By default wait up to 3 seconds for devices to appear. */
-static int root_mount_timeout = 3;
+/* By default wait up to 30 seconds for devices to appear. */
+static int root_mount_timeout = 30;
 TUNABLE_INT("vfs.mountroot.timeout", &root_mount_timeout);

 struct root_hold_token *
@@ -690,7 +690,7 @@
 char *errmsg;
 struct mntarg *ma;
 char *dev, *fs, *opts, *tok;
-int delay, error, timeout;
+int delay, error, timeout, err_stride;

 error = parse_token(conf, &fs);
 if (error)
@@ -727,11 +727,20 @@
 goto out;
 }

+/*
+ * For ZFS we can't simply wait for a specific device
+ * as we only know the pool name. To work around this,
+ * parse_mount() will retry the mount later on.
+ *
+ * While retrying for NFS could be implemented similarly
+ * it is currently not supported.
+ */
+delay = hz / 10;
+timeout = root_mount_timeout * hz;
+
 if (strcmp(fs, "zfs") != 0 && strstr(fs, "nfs") == NULL &&
 dev[0] != '\0' && !parse_mount_dev_present(dev)) {
 printf("mountroot: waiting for device %s ...\n", dev);
-delay = hz / 10;
-timeout = root_mount_timeout * hz;
 do {
 pause("rmdev", delay);
 timeout -= delay;
@@ -741,16 +750,34 @@
 goto out;
 }
 }
+/* Timeout keeps counting down */

-ma = NULL;
-ma = mount_arg(ma, "fstype", fs, -1);
-ma = mount_arg(ma, "fspath", "/", -1);
-ma = mount_arg(ma, "from", dev, -1);
-ma = mount_arg(ma, "errmsg", errmsg, ERRMSGL);
-ma = mount_arg(ma, "ro", NULL, 0);
-ma = parse_mountroot_options(ma, opts);
-error = kernel_mount(ma, MNT_ROOTFS);
+err_stride=0;
+do {
+ma = NULL;
+ma = mount_arg(ma, "fstype", fs, -1);
+ma = mount_arg(ma, "fspath", "/", -1);
+ma = mount_arg(ma, "from", dev, -1);
+ma = mount_arg(ma, "errmsg", errmsg, ERRMSGL);
+ma = mount_arg(ma, "ro", NULL, 0);
+ma = parse_mountroot_options(ma, opts);

+error = kernel_mount(ma, MNT_ROOTFS);
+/* UFS only does it once */
+if (strcmp(fs, "zfs") != 0)
+break;
+timeout -= delay;
+if (timeout > 0 && error) {
+if (err_stride <= 0 ) {
+printf("Mounting from %s:%s failed with error %d. "
+"%d seconds left. Retrying.\n", fs, dev, error,
+timeout / hz);
+}
+err_stride += 1;
+err_stride %= 50;
+pause("rmzfs", delay);
+}
+} while (timeout > 0 && error);
  out:
 if (error) {
 printf("Mounting from %s:%s failed with error %d",
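
Stripped of the kernel specifics, the patch is a bounded retry loop: keep attempting the mount every `delay` ticks until it succeeds or the timeout budget runs out, and only print a progress message every Nth failure. The user-space sketch below illustrates that shape only. retry_mount, mount_attempt_fn and fake_attempt are hypothetical names, the callback stands in for kernel_mount(), and nothing here actually sleeps.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for one mount attempt: 0 on success, else an errno value. */
typedef int (*mount_attempt_fn)(void *ctx);

/*
 * Retry `attempt` every `delay` ticks until it succeeds or the `timeout`
 * budget (in ticks) is used up. Returns the last error (0 on success).
 * `report_every` throttles the message, like err_stride in the patch.
 */
int
retry_mount(mount_attempt_fn attempt, void *ctx, int timeout, int delay,
    int report_every)
{
	int error, stride = 0;

	assert(delay > 0 && report_every > 0);
	do {
		error = attempt(ctx);
		timeout -= delay;
		if (timeout > 0 && error != 0) {
			if (stride == 0)
				printf("mount failed with error %d, "
				    "%d ticks left, retrying\n", error, timeout);
			stride = (stride + 1) % report_every;
			/* The patch calls pause("rmzfs", delay) here; this sketch does not sleep. */
		}
	} while (timeout > 0 && error != 0);
	return (error);
}

/* Test double: fails with error 5 until the counter in ctx reaches zero. */
int
fake_attempt(void *ctx)
{
	int *remaining_failures = ctx;

	if (*remaining_failures > 0) {
		(*remaining_failures)--;
		return (5);
	}
	return (0);
}
```

With a 30-tick budget and a 10-tick delay the loop makes exactly three attempts before giving up, which mirrors how the patch bounds the total wait by vfs.mountroot.timeout.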




Re: mountroot failing on zpools in Azure after upgrade from 10 to 11 due to lack of waiting for da0

2017-03-19 Thread Julian Elischer

This was a bug in 10.3 that I thought was fixed in 11. I believe it's 
fixed in 10-stable. Maybe 11.0 missed it?

If you still see this bug, look at adding the patch in bug
   https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=208882

It makes the mountroot code retry, which is all you need.

Julian



Re: [IGNORE] ldd linker script /usr/lib/libc.so fail [IGNORE]

2017-01-29 Thread Julian Elischer

Tracked this down to a rogue copy of libc.so in an unexpected place 
which was being found earlier than the real one.



On 30/1/17 1:13 am, Julian Elischer wrote:

Hi

The linker script /usr/lib/libc.so fails when you are using the 
--sysroot option because it contains absolute paths.

Does anyone know if there is a way to add the sysroot to the script?

Currently the one in our sysroot looks like:

$ cat /usr/build/buildroot/tools/x86_FBSD1X_gcc4.2.4/usr/lib/libc.so
/* $FreeBSD$ */
GROUP ( /lib/libc.so.7 /usr/lib/libc_nonshared.a 
/usr/lib/libssp_nonshared.a )


but I'd like to do something like:

GROUP ( ${sysroot}/lib/libc.so.7 ${sysroot}/usr/lib/libc_nonshared.a 
${sysroot}/usr/lib/libssp_nonshared.a )


but don't think I can do that

from what I see below however it shouldn't be needed.

Is this a bug in our version of ld? or am I misreading it?


I quote from one such source :

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/4/html/Using_ld_the_GNU_Linker/simple-commands.html 




INPUT(file, file, …), INPUT(file file …)

   The INPUT command directs the linker to include the named files in
   the link, as though they were named on the command line.

   For example, if you always want to include subr.o any time you do
   a link, but you can't be bothered to put it on every link command
   line, then you can put INPUT (subr.o) in your linker script.

   In fact, if you like, you can list all of your input files in the
   linker script, and then invoke the linker with nothing but a -T
   option.

   In case a /sysroot prefix/ is configured, and the filename starts
   with the / character, and the script being processed was located
   inside the /sysroot prefix/, the filename will be looked for in
   the /sysroot prefix/. Otherwise, the linker will try to open the
   file in the current directory. If it is not found, the linker will
   search through the archive library search path. See the
   description of -L in Section 3.1 /Command Line Options/
<https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/4/html/Using_ld_the_GNU_Linker/invocation.html#OPTIONS>.


   If you use INPUT (-lfile), ld will transform the name to
   libfile.a, as with the command line argument -l.

   When you use the INPUT command in an implicit linker script, the
   files will be included in the link at the point at which the
   linker script file is included. This can affect archive searching.

GROUP(file, file, …), GROUP(file file …)

   The GROUP command is like INPUT, except that the named files
   should all be archives, and they are searched repeatedly until no
   new undefined references are created.



___
freebsd-curr...@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to 
"freebsd-current-unsubscr...@freebsd.org"






ldd linker script /usr/lib/libc.so fail

2017-01-29 Thread Julian Elischer

The linker script /usr/lib/libc.so fails when you are using the
--sysroot option because it contains absolute paths.

Does anyone know if there is a way to add the sysroot to the script?

Currently the one in our sysroot looks like:

$ cat /usr/build/buildroot/tools/x86_FBSD1X_gcc4.2.4/usr/lib/libc.so
/* $FreeBSD$ */
GROUP ( /lib/libc.so.7 /usr/lib/libc_nonshared.a
/usr/lib/libssp_nonshared.a )

but I'd like to do something like:

GROUP ( ${sysroot}/lib/libc.so.7 ${sysroot}/usr/lib/libc_nonshared.a
${sysroot}/usr/lib/libssp_nonshared.a )

but don't think I can do that

from what I see below however it shouldn't be needed.

Is this a bug in our version of ld? or am I misreading it?

I quote from one such source :

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/4/html/Using_ld_the_GNU_Linker/simple-commands.html

INPUT(file, file, …), INPUT(file file …)

The INPUT command directs the linker to include the named files in
the link, as though they were named on the command line.

For example, if you always want to include subr.o any time you do
a link, but you can't be bothered to put it on every link command
line, then you can put INPUT (subr.o) in your linker script.

In fact, if you like, you can list all of your input files in the
linker script, and then invoke the linker with nothing but a -T
option.

In case a /sysroot prefix/ is configured, and the filename starts
with the / character, and the script being processed was located
inside the /sysroot prefix/, the filename will be looked for in
the /sysroot prefix/. Otherwise, the linker will try to open the
file in the current directory. If it is not found, the linker will
search through the archive library search path. See the
description of -L in Section 3.1 /Command Line Options/

If you use INPUT (-lfile), ld will transform the name to
libfile.a, as with the command line argument -l.

When you use the INPUT command in an implicit linker script, the
files will be included in the link at the point at which the
linker script file is included. This can affect archive searching.

GROUP(file, file, …), GROUP(file file …)

The GROUP command is like INPUT, except that the named files
should all be archives, and they are searched repeatedly until no
new undefined references are created.
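
The sysroot rule in the quoted text can be modelled as a small path-resolution function. This is a sketch of the documented rule only, not of ld's actual source: resolve_script_file and path_is_inside are hypothetical names, the sysroot is assumed to have no trailing slash, and the "otherwise" case (current directory, then the library search path) is simplified to returning the name unchanged.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Nonzero if `path` lies inside directory `dir` (component-aware prefix match). */
int
path_is_inside(const char *path, const char *dir)
{
	size_t n = strlen(dir);

	return (n > 0 && strncmp(path, dir, n) == 0 && path[n] == '/');
}

/*
 * Model of the quoted rule: if a sysroot is configured, `file` starts
 * with '/', and the linker script itself lives inside the sysroot, the
 * file is looked up under the sysroot. Otherwise the name is unchanged.
 * Returns 0 on success, -1 if the result does not fit in `out`.
 */
int
resolve_script_file(const char *sysroot, const char *script, const char *file,
    char *out, size_t outlen)
{
	int n;

	assert(out != NULL && outlen > 0);
	if (sysroot != NULL && file[0] == '/' && path_is_inside(script, sysroot))
		n = snprintf(out, outlen, "%s%s", sysroot, file);
	else
		n = snprintf(out, outlen, "%s", file);
	return ((n < 0 || (size_t)n >= outlen) ? -1 : 0);
}
```

Under this reading, /lib/libc.so.7 named in a script inside the sysroot resolves beneath the sysroot without any ${sysroot} variable, while the same script found outside the sysroot keeps its absolute host paths.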


problem with mpt driver. anyone seen this or similar? (10.3)

2016-11-08 Thread Julian Elischer


Does this ring any bells?
Even a theory would be a big improvement.

memcpy+0xc
mpt_read_cfg_page+0xcc
mpt_action+0x148e
xpt_action_default+0x7e
cam_periph_runccb+0x7c
passdoioctl+0x719
passioctl+0x30
devfs_ioctl_f+0x7c
kern_ioctl+0x1a8
sys_ioctl+0x11f
amd64_syscall+0x3f9
xfast_syscall+0xf7

We see a memory access fault at line 1821:

1786 int
1787 mpt_read_cfg_page(struct mpt_softc *mpt, int Action, uint32_t PageAddress,
1788                   CONFIG_PAGE_HEADER *hdr, size_t len, int sleep_ok,
1789                   int timeout_ms)
1790 {
1791         request_t  *req;
1792         cfgparms_t  params;
1793         int         error;
1794
1795         req = mpt_get_request(mpt, sleep_ok);
1796         if (req == NULL) {
1797                 mpt_prt(mpt, "mpt_read_cfg_page: Get request failed!\n");
1798                 return (-1);
1799         }
1800
1801         params.Action = Action;
1802         params.PageVersion = hdr->PageVersion;
1803         params.PageLength = hdr->PageLength;
1804         params.PageNumber = hdr->PageNumber;
1805         params.PageType = hdr->PageType & MPI_CONFIG_PAGETYPE_MASK;
1806         params.PageAddress = PageAddress;
1807         error = mpt_issue_cfg_req(mpt, req, &params,
1808                                   req->req_pbuf + MPT_RQSL(mpt),
1809                                   len, sleep_ok, timeout_ms);
1810         if (error != 0) {
1811                 mpt_prt(mpt, "read_cfg_page(%d) timed out\n", Action);
1812                 return (-1);
1813         }
1814
1815         if ((req->IOCStatus & MPI_IOCSTATUS_MASK) != MPI_IOCSTATUS_SUCCESS) {
1816                 mpt_prt(mpt, "mpt_read_cfg_page: Config Info Status %x\n",
1817                     req->IOCStatus);
1818                 mpt_free_request(mpt, req);
1819                 return (-1);
1820         }
1821         memcpy(hdr, ((uint8_t *)req->req_vbuf)+MPT_RQSL(mpt), len);   <--
1822         mpt_free_request(mpt, req);
1823         return (0);
1824 }
1825
1826 int
1827 mpt_write_cfg_page(struct mpt_softc *mpt, int Action, uint32_t PageAddress,
"mpt/mpt.c" [readonly] 3146 lines --58%--


Re: fix for use-after-free problem in 10.x

2016-10-09 Thread Julian Elischer


On 8/10/2016 5:36 AM, Oliver Pinter wrote:

On 10/5/16, Julian Elischer <jul...@freebsd.org> wrote:

In 11 and 12 the taskqueue code has been rewritten in this area but
under 10 this bug still occurs.

On our appliances this bug stops the system from mounting the ZFS
root, so it is quite severe.
Basically, while the thread is sleeping during the ZFS mount of root
(in the while loop), another thread can free the 'task' item it is
checking in that while loop, and it can be reused or filled with
'deadcode' etc., with the waiting code unaware of the change. The fix
is to refetch the item at the end of the queue each time around the loop.
I don't really want to do the bigger change of MFCing the change in
11, as it is more extensive, though if someone else does, that's OK by
me (if it's ABI compatible).

Any comments or suggestions?

Yes, please commit them. This patch fixes the ZFS + GELI + INVARIANTS
problem for us.
There is the FreeBSD PR about the issue:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209580


I committed a slightly better version to stable/10.
Should I ask for a merge to releng/10.3?






here's the fix in diff form:


[robot@porridge /usr/src]$ p4 diff -du ...
--- //depot/pbranches/jelischer/FreeBSD-PZ/10.3/sys/kern/subr_taskqueue.c  2016-09-27 09:14:59.0 -0700
+++ /usr/src/sys/kern/subr_taskqueue.c  2016-09-27 09:14:59.0 -0700
@@ -441,9 +441,10 @@

        TQ_LOCK(queue);
        task = STAILQ_LAST(&queue->tq_queue, task, ta_link);
-       if (task != NULL)
-               while (task->ta_pending != 0)
-                       TQ_SLEEP(queue, task, &queue->tq_mutex, PWAIT, "-", 0);
+       while (task != NULL && task->ta_pending != 0) {
+               TQ_SLEEP(queue, task, &queue->tq_mutex, PWAIT, "-", 0);
+               task = STAILQ_LAST(&queue->tq_queue, task, ta_link);
+       }
        taskqueue_drain_running(queue);
        KASSERT(STAILQ_EMPTY(&queue->tq_queue),
  ("taskqueue queue is not empty after draining"));
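
The essence of the fix is "never keep using a pointer across a sleep; re-read it from the queue afterwards". The user-space sketch below illustrates that pattern only. struct task and struct queue are simplified stand-ins rather than the kernel definitions, and worker_runs_one is a hypothetical callback playing the role of the other thread that frees tasks while the drainer is asleep in TQ_SLEEP().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel's task and taskqueue structures. */
struct task {
	struct task *next;
	int pending;
};

struct queue {
	struct task *head;
	struct task *tail;
};

/* Append a task with the given pending count. */
void
queue_push(struct queue *q, int pending)
{
	struct task *t = malloc(sizeof(*t));

	assert(t != NULL);
	t->next = NULL;
	t->pending = pending;
	if (q->tail != NULL)
		q->tail->next = t;
	else
		q->head = t;
	q->tail = t;
}

/* Models another thread running and freeing the head task during the sleep. */
void
worker_runs_one(struct queue *q)
{
	struct task *t = q->head;

	if (t == NULL)
		return;
	q->head = t->next;
	if (q->head == NULL)
		q->tail = NULL;
	free(t);	/* any pointer to t cached by the drainer now dangles */
}

/*
 * Drain by re-reading the tail after every "sleep" instead of caching it.
 * `sleep_fn` stands in for TQ_SLEEP(); returns the number of sleeps.
 */
int
drain(struct queue *q, void (*sleep_fn)(struct queue *))
{
	struct task *t;
	int sleeps = 0;

	t = q->tail;
	while (t != NULL && t->pending != 0) {
		sleep_fn(q);
		sleeps++;
		t = q->tail;	/* refetch: the old tail may have been freed */
	}
	return (sleeps);
}
```

With the pre-patch shape of the loop, the second check of t->pending would read memory that worker_runs_one had already freed.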

___
freebsd-hack...@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"



___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Re: fix for use-after-free problem in 10.x [review please].

2016-10-05 Thread Julian Elischer


Please review..
https://reviews.freebsd.org/D8160
Direct fix for stable/10 as bug is not present in 11+ in this form.

Julian


On 4/10/2016 8:06 PM, Julian Elischer wrote:
In 11 and 12 the taskqueue code has been rewritten in this area but 
under 10 this bug still occurs.


On our appliances this bug stops the system from mounting the ZFS 
root, so it is quite severe.
Basically while the thread is sleeping during the ZFS mount of root 
(in the while loop), another thread can free the 'task' item it is 
checking in that while loop and it can be reused or filled with 
'deadcode' etc., with the waiting code unaware of the change.. The 
fix is to refetch the item at the end of the queue each time around 
the loop.
I don't really want to do the bigger change of MFCing the change in 
11, as it is more extensive, though if someone else does, that's ok 
by me. (If it's ABI compatible)


Any comments or suggestions?

here's the fix in diff form:


A slightly better fix is at
https://reviews.freebsd.org/D8160




[robot@porridge /usr/src]$ p4 diff -du ...
--- //depot/pbranches/jelischer/FreeBSD-PZ/10.3/sys/kern/subr_taskqueue.c  2016-09-27 09:14:59.0 -0700
+++ /usr/src/sys/kern/subr_taskqueue.c  2016-09-27 09:14:59.0 -0700
@@ -441,9 +441,10 @@

        TQ_LOCK(queue);
        task = STAILQ_LAST(&queue->tq_queue, task, ta_link);
-       if (task != NULL)
-               while (task->ta_pending != 0)
-                       TQ_SLEEP(queue, task, &queue->tq_mutex, PWAIT, "-", 0);
+       while (task != NULL && task->ta_pending != 0) {
+               TQ_SLEEP(queue, task, &queue->tq_mutex, PWAIT, "-", 0);
+               task = STAILQ_LAST(&queue->tq_queue, task, ta_link);
+       }
        taskqueue_drain_running(queue);
        KASSERT(STAILQ_EMPTY(&queue->tq_queue),
            ("taskqueue queue is not empty after draining"));




fix for use-after-free problem in 10.x

2016-10-04 Thread Julian Elischer

In 11 and 12 the taskqueue code has been rewritten in this area but 
under 10 this bug still occurs.


On our appliances this bug stops the system from mounting the ZFS 
root, so it is quite severe.
Basically while the thread is sleeping during the ZFS mount of root 
(in the while loop), another thread can free the 'task' item it is 
checking in that while loop and it can be reused or filled with 
'deadcode' etc., with the waiting code unaware of the change.. The fix 
is to refetch the item at the end of the queue each time around the loop.
I don't really want to do the bigger change of MFCing the change in 
11, as it is more extensive, though if someone else does, that's ok by 
me. (If it's ABI compatible)


Any comments or suggestions?

here's the fix in diff form:


[robot@porridge /usr/src]$ p4 diff -du ...
--- //depot/pbranches/jelischer/FreeBSD-PZ/10.3/sys/kern/subr_taskqueue.c  2016-09-27 09:14:59.0 -0700
+++ /usr/src/sys/kern/subr_taskqueue.c  2016-09-27 09:14:59.0 -0700
@@ -441,9 +441,10 @@

        TQ_LOCK(queue);
        task = STAILQ_LAST(&queue->tq_queue, task, ta_link);
-       if (task != NULL)
-               while (task->ta_pending != 0)
-                       TQ_SLEEP(queue, task, &queue->tq_mutex, PWAIT, "-", 0);
+       while (task != NULL && task->ta_pending != 0) {
+               TQ_SLEEP(queue, task, &queue->tq_mutex, PWAIT, "-", 0);
+               task = STAILQ_LAST(&queue->tq_queue, task, ta_link);
+       }
        taskqueue_drain_running(queue);
        KASSERT(STAILQ_EMPTY(&queue->tq_queue),
("taskqueue queue is not empty after draining"));
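
The refetch-after-sleep idea can be shown outside the kernel with a small
userspace sketch. This is not the kernel code: struct item, struct queue,
queue_last(), run_one() and POISON are hypothetical stand-ins for struct
task, the taskqueue, STAILQ_LAST(), the worker thread and the 'deadcode'
fill, and the sleep is simulated by a callback rather than a real wakeup.

```c
#include <assert.h>
#include <stddef.h>

#define POISON 0x0dead	/* stands in for the 'deadcode' fill pattern */

/* Hypothetical stand-in for struct task; only the fields the loop needs. */
struct item {
	int pending;
	struct item *next;
};

/* Hypothetical stand-in for the taskqueue and for TQ_SLEEP() wake-ups. */
struct queue {
	struct item *head;
	void (*sleep_fn)(struct queue *);	/* runs while the drainer "sleeps" */
};

/* Like STAILQ_LAST(): walk to the tail, NULL if the queue is empty. */
static struct item *
queue_last(struct queue *q)
{
	struct item *it = q->head;

	while (it != NULL && it->next != NULL)
		it = it->next;
	return (it);
}

/* Simulated worker: dequeue the head item, then recycle its memory. */
static void
run_one(struct queue *q)
{
	struct item *it = q->head;

	if (it == NULL)
		return;
	q->head = it->next;
	it->next = NULL;
	it->pending = POISON;	/* as if freed and reused by someone else */
}

/* The fixed loop: refetch the tail after every sleep; returns sleep count. */
static int
drain_all(struct queue *q)
{
	struct item *it = queue_last(q);
	int sleeps = 0;

	assert(q->sleep_fn != NULL);
	while (it != NULL && it->pending != 0) {
		q->sleep_fn(q);
		sleeps++;
		it = queue_last(q);	/* never trust the pre-sleep pointer */
	}
	return (sleeps);
}

/* Three queued items: the loop ends once the queue is seen to be empty. */
static int
demo_drain(void)
{
	struct item c = { 1, NULL }, b = { 1, &c }, a = { 1, &b };
	struct queue q = { &a, run_one };

	return (drain_all(&q));
}
```

With the old form of the loop the pointer to the last item would be kept
across the sleep, and in this simulation it would keep reading the poisoned
'pending' field forever; with the refetch the loop stops as soon as the
queue is empty.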


Re: mfi driver performance too bad on LSI MegaRAID SAS 9260-8i

2016-06-20 Thread Julian Elischer


On 17/06/2016 3:16 PM, Jason Zhang wrote:

Hi,

I am working on a storage service based on FreeBSD.  I was looking forward to a
good result because many professional storage companies use FreeBSD as their OS,
but I am disappointed with the poor performance.  I tested the performance of the
LSI MegaRAID 9260-8i and got the following bad results:

1.  Test environment:
    (1) OS: FreeBSD 10.0 release
    (2) Memory: 16G
    (3) RAID adapter: LSI MegaRAID 9260-8i
    (4) Disks: 9 SAS hard drives (1 rpm), performance is as expected for each hard drive
    (5) Test tools: fio with io-depth=1, thread num is 32 and block size is 64k or 1M
    (6) RAID configuration: RAID 5, stripe size is 1M

2.  Test result:
    (1) write performance too bad: 20 Mbytes/s throughput and 200 random write IOPS
    (2) read performance is as expected: 700 Mbytes/s throughput and 1500 random read IOPS


I tested the same hardware configuration with CentOS Linux, and Linux's write
performance is 5 times better than FreeBSD's.


Has anyone encountered the same performance problem?  Does the mfi driver have
a performance issue, or should I give up on FreeBSD?



Unfortunately issues related to performance can often be very specific.
We use the LSI cards with great success under FreeBSD 8 in our product 
at work but it is impossible to say what is specifically wrong in your 
setup.


Some years ago I did discover that fio needed to have completely 
different arguments to get good performance under FreeBSD, so please 
check that first.


What does performance look like with a single large write stream?

Also look at the handling of interrupts (systat -vmstat) to ensure
that interrupts are being handled correctly.
That can vary greatly from motherboard to motherboard and BIOS to
BIOS (even between revisions).
Sometimes Linux will cope differently with these issues as they have
better support from the motherboard makers themselves.

(Sometimes we cope better too.)

One final thought.. make sure you have partitioned your drives and
filesystems so that all the block boundaries agree and line up.
At one place I worked we found we had accidentally partitioned all our
drives starting 63 sectors into the drive.
That did NOT work well. :-) 8k raid stripe writes were always 2 
writes  (and sometimes a read)
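
The arithmetic behind that can be sketched in a few lines of C. The
512-byte sector size is an assumption (the message does not state it), and
stripes_touched() and abs_offset() are hypothetical helper names used only
for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of RAID stripes touched by a write of 'len' bytes that starts at
 * absolute byte offset 'off' on the array, for a stripe of 'stripe' bytes.
 */
static uint64_t
stripes_touched(uint64_t off, uint64_t len, uint64_t stripe)
{
	assert(stripe > 0 && len > 0);
	return ((off + len - 1) / stripe - off / stripe + 1);
}

/*
 * Absolute byte offset of a write at 'part_off' bytes into a partition
 * that starts 'start_sector' sectors into the drive.
 */
static uint64_t
abs_offset(uint64_t start_sector, uint64_t sector_size, uint64_t part_off)
{
	return (start_sector * sector_size + part_off);
}
```

With a partition starting at sector 63, an 8k write at the start of the
partition begins at byte 32256, which is not a multiple of 8192, so it
straddles two stripes; starting the partition at sector 64 (byte 32768)
keeps the same write inside one stripe.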





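Zhang Jingcheng (Jason)

Cyphy Technology (Xiamen) Co., Ltd.
Headquarters: Units 901-904, Block A, 55 Wanghai Road, Software Park, Xiamen
R&D headquarters: 2 Daqudeng Hutong, Meishuguan Back Street, Dongcheng District, Beijing
Hotline: 4008798066
Switchboard: 0592-2936100
Email: jasonzh...@cyphytech.com
Website: Http://www.cyphytech.com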


___
freebsd-performa...@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-performance
To unsubscribe, send any mail to "freebsd-performance-unsubscr...@freebsd.org"





Re: Toggling between remote KGDB and local DDB within a debugging session

2016-04-19 Thread Julian Elischer


On 19/04/2016 8:49 PM, Aijaz Baig wrote:

Hello

I think the title says it all!! :)

I would like to know if there is indeed a way to toggle between gdb
and ddb while debugging a remote kernel. I am already at the gdb (or
rather kgdb) prompt. From here how do I switch to local ddb on the
debugged machine??

You don't.. at the moment I think it's a one-way street, but at one
stage you could "detach" and it would switch back to ddb.. I don't
think that works any more.
I've looked at making it work more than once but never got enough of
an understanding to make it work.
I suspect that it is a case of setting the appropriate word somewhere
to the appropriate value.

How to find that location from gdb is the hard part.




My kernel configuration file already contains 'options
BREAK_TO_DEBUGGER' and I have BOTH GDB and DDB configured aka:
options GDB
options DDB

I tried adding 'options KDB_UNATTENDED' but that does not make any difference.

As per the developer's handbook, "Every time you type gdb, the mode
will be toggled between remote GDB and local DDB. In order to force a
next trap immediately, simply type s (step). Your hosting GDB will now
gain control over the target kernel:" Now when you type 'gdb' at the
DDB prompt, KGDB takes over remotely. On continuing at the KGDB
prompt, you arrive back at the debugged machine but it is no longer
under the control of DDB.

My question is, how do I drop to DDB from within a running machine
whose serial ports (albeit virtual ones) are remotely attached to
another machine? When remote KGDB is listening and I force a
panic using 'sysctl debug.kdb.enter=1', it drops into remote KGDB.
However, when it is NOT listening on the serial port, the local system
just freezes

What I want, is to enter ddb on the local machine. Do some debugging
using it; drop to remote KGDB for things that are best done using
KGDB, then switch back to local DDB when I'm done.

Is there a way to do that? If yes please do let me know




Re: HPN and None options in OpenSSH

2016-01-22 Thread Julian Elischer


On 22/01/2016 10:31 PM, Dag-Erling Smørgrav wrote:

The HPN and None cipher patches have been removed from FreeBSD-CURRENT.
I intend to remove them from FreeBSD-STABLE this weekend.

The HPN patches were of limited usefulness and required a great deal of
effort to maintain in our tree.  The None cipher patch was less onerous,
but it was a terrible idea with a very small user base since it was a
compile-time option and off by default.

The HPN-related configuration variables have been marked deprecated,
while those related to the None cipher have been marked unsupported.
This means that the former will be accepted with a warning, whereas the
latter will result in an error.

Most users will not be affected by this change.  Those who are should
switch to the openssh-portable port, which still offers both patches,
with HPN enabled by default.

It is expected that FreeBSD 10.3 will ship with OpenSSH 7.1p2, with a
number of modifications intended to reduce the impact of upstream
changes on existing systems.

What is the internal window size in the new ssh?


DES



Re: Have I got this VIMAGE setup correct?

2015-12-22 Thread Julian Elischer


On 23/12/2015 1:05 AM, Garrett Wollman wrote:

The consensus when I asked seemed to be that VIMAGE+jail was the right
combination to give every container its own private loopback
interface, so I tried to build that.  I noticed a few things:

1) The kernel prints out a warning message at boot time that VIMAGE is
"highly experimental".  Should I be concerned about running this in
production?

CYA only

If you are not doing much that is super unusual you should be fine.


2) Stopping jails with virtual network stacks generates warnings from
UMA about memory being leaked.


I haven't any information about that.


3) It wasn't clear (or documented anywhere that I could see) how to
get the host network set up properly.  Obviously I'm not going to have
a vlan for every single jail, so it seemed like what most people were
doing was "bridge" along with a bunch of "epair" interfaces.  I ended
up with the following:

There are examples in /usr/share/examples/netgraph for some things..
I've never used the built-in configuration stuff, I always hand-coded
it.. It's probably improved a lot since then.

network_interfaces="lo0 bridge0 bce0"
autobridge_interfaces="bridge0"
autobridge_bridge0="bce0 epair0a epair1a"
cloned_interfaces="bridge0 epair0 epair1"
ifconfig_bridge0="inet [deleted] netmask 0xff00"
ifconfig_bridge0_ipv6="inet6 [deleted] prefixlen 64 accept_rtadv"
ifconfig_bce0="up"
ifconfig_epair0a="up"
ifconfig_epair1a="up"

The net.link.bridge.inherit_mac sysctl, which is documented in
bridge(4), doesn't appear to work; I haven't yet verified that I can
create a /etc/start_if.bridge0 to set the MAC address manually without
breaking something else.  The IPv6 stack regularly prints
"in6_if2idlen: unknown link type (209)" to the console, which is
annoying, and IPv6 on the host doesn't entirely work -- it accepts
router advertisements but then gives [ENETUNREACH] trying to actually
send packets to the default gateway.  (IPv6 to the jails *does* work!)

In each of the jails I have to manually configure a MAC address using
/etc/start_if.epairNb to ensure that it's globally unique, but then
everything seems to work.
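
One way to pick per-jail MAC addresses that cannot collide with
vendor-assigned ones is to use the locally administered range. The sketch
below is only an illustration of that idea: jail_mac() and format_mac() are
hypothetical helpers, and the 02:00 prefix and the use of a jail number as
the id are assumptions, not something taken from the setup above. The
result is unique only as long as the ids are unique on the local network.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Build a locally administered, unicast MAC address from a 32-bit id.
 * First octet 0x02: "locally administered" bit set, multicast bit clear.
 * The 0x00 second octet and the big-endian id layout are arbitrary choices.
 */
static void
jail_mac(uint32_t id, uint8_t mac[6])
{
	mac[0] = 0x02;
	mac[1] = 0x00;
	mac[2] = (uint8_t)(id >> 24);
	mac[3] = (uint8_t)(id >> 16);
	mac[4] = (uint8_t)(id >> 8);
	mac[5] = (uint8_t)id;
}

/* Format as xx:xx:xx:xx:xx:xx; returns the snprintf() result (17 on success). */
static int
format_mac(const uint8_t mac[6], char *buf, size_t len)
{
	return (snprintf(buf, len, "%02x:%02x:%02x:%02x:%02x:%02x",
	    mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]));
}
```

The formatted string is in the form that "ifconfig epairNb ether ..."
accepts, so it could be written into the per-jail start_if file.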

Does this match up with what other people have been doing?  Anything
I've missed?  Any patches I should pull up to make this setup more
reliable before I roll it out in production?


I haven't used it for a couple of years..
I know others are, so I'll let them pipe up.


-GAWollman
___
freebsd-...@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"




Re: FreeBSD Quarterly Status Report - Second Quarter 2015

2015-07-27 Thread Julian Elischer


On 7/27/15 10:32 PM, Willem Jan Withagen wrote:

On 27/07/2015 16:25, Glen Barber wrote:

On Mon, Jul 27, 2015 at 04:14:54PM +0200, Willem Jan Withagen wrote:

On 27/07/2015 04:39, Benjamin Kaduk wrote:

   * Separated email services (and single-point-of-failure cases) from
 the machine that has been handling this task for over 18 years, to
 new, single-purpose service installations

Hi,

This sort of sounds like the system that a former company (IAE) donated
to Jordan when he was here in Arnhem at a FreeBSD meeting organized by
Wilco Bulte. I think it was called freefall??
There used to be pictures of the meeting online, but I can't seem to
find them.

Would be nice to know if that is the case, because then I'm really
impressed with the life time of that system...
Does anybody know if this is actually the case?


Based on what I've recently learned of the machine's history, it was
originally freefall, then became known as 'hub'.

Do you have any idea what hardware was actually in the box?

If I remember correctly we gave Jordan a check for like 5000 guilders.
Which I guess would be 2500 us$ at that time. Which was not an enormous
amount of money, so even more impressive that the system lasted 18 years :)


I think it was a bit like my grandfather's axe..

A really great axe.  We replaced the handle 3 times, the head four times
and put in a couple of new wedges, but it's a great axe that one!


--WjW

___
freebsd-curr...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org




Re: EuroBSDcon 2013: Call for Proposals, Conference on September 28+29 2013

2013-04-14 Thread Julian Elischer


On 4/11/13 5:18 PM, Andre Oppermann wrote:

Excuse me for being slightly spammy but I've received feedback that we
haven't spread this information widely enough outside the inner circles
and interested people missed the announcement.

EuroBSDcon 2013: September 28-29 in Malta
=========================================

EuroBSDcon is the European technical conference for users and developers
of BSD-based systems. The conference will take place Saturday and Sunday
28-29 September at the Hilton Conference Centre in St. Julian's, Malta
(tutorials and FreeBSD Developer Summit on preceding Thursday and Friday,
talks on Saturday and Sunday).  [Yes, very nice weather at that time of
year, about 26/19C sunny no rain, Social event on Saturday evening is going
to be a sunset beach BBQ]


The web page suggests I bring my wife AND my spouse..  what if they
don't know about each other?




Re: flowtable usable or not

2012-03-02 Thread Julian Elischer


On 3/2/12 10:21 AM, Doug Barton wrote:

On 03/02/2012 03:44, K. Macy wrote:



not sure who wrote:

Correct. However, I'm not sure the analogy is flawed. I am, to some
degree, guilty of the same sin. I now run Ubuntu and have never had a
single problem keeping my package system up to date, in stark contrast to
my experiences of slow and nightmarishly error-ridden port updates.




but I use the PBIs from pcbsd..  you REALLY don't have this problem 
with them.



Doug




Re: pthread_cond_timedwait() broken in 9-stable? (from JAN 10)

2012-02-17 Thread Julian Elischer


On 2/16/12 11:41 PM, Julian Elischer wrote:

adding jkim as he seems to be the last person working with TSC.


On 2/16/12 6:42 PM, David Xu wrote:

On 2012/2/17 10:19, Julian Elischer wrote:

On 2/16/12 5:56 PM, David Xu wrote:

On 2012/2/17 8:42, Julian Elischer wrote:
Adding David Xu for his thoughts since he rewrote the code in
question in revision 213098


On 2/16/12 2:57 PM, Julian Elischer wrote:

On 2/16/12 1:06 PM, Julian Elischer wrote:

On 2/16/12 9:34 AM, Andriy Gapon wrote:

on 15/02/2012 23:41 Julian Elischer said the following:

The program fio (an IO test in ports) uses pthreads

the following code (from fio-2.0.3, but it's in earlier code too)
has suddenly started misbehaving.

    clock_gettime(CLOCK_REALTIME, &t);
    t.tv_sec += seconds + 10;

    pthread_mutex_lock(&mutex->lock);

    while (!mutex->value && !ret) {
        mutex->waiters++;
        ret = pthread_cond_timedwait(&mutex->cond, &mutex->lock, &t);
        mutex->waiters--;
    }

    if (!ret) {
        mutex->value--;
        pthread_mutex_unlock(&mutex->lock);
    }


It turns out that 'ret' sometimes comes back instantly (on my machine)
with a value of 60 (ETIMEDOUT)
despite the fact that we set the timeout 10 seconds into the future.

Has anyone else seen anything like this?
(and yes the condition variable attribute has been set to use the
REALTIME clock).

But why?

Just a hypothesis that maybe there is some issue with time 
keeping on that system.

How would that code work out for you with MONOTONIC?


Jens Axboe, (CC'd) tried both CLOCK_REALTIME and 
CLOCK_MONOTONIC, and they both had the same problem..

i.e. random early returns with ETIMEDOUT.

I think we will try moving our machine forward to a newer
-stable to see if it resolves.
Kan upgraded the machine today to today's 9.x branch tip and 
the problem still occurs.

8.x does not have this problem.

I have not got a 9-RELEASE machine to test on.. so I can not 
tell if this came in with the burst of stuff
that came in after the 9.x branch was unfrozen after the 
release of 9.0.





I am trying to reproduce the problem,  do you have complete 
sample code to test ?


I'm still looking for the exact set
but on my machine (4 cpus) the program from ports sysutils/fio 
exhibits the problem when used with

kern.timecounter.hardware=TSC-low and with the following config file:

pu05 # cat config.fio

[global]
#clocksource=cpu
direct=1
rw=randread
bs=4096
fill_device=1
numjobs=16
iodepth=16
#ioengine=posixaio
#ioengine=psync
ioengine=psync
group_reporting
norandommap
time_based
runtime=6
randrepeat=0

[file1]
filename=/dev/ada0

pu05 #
pu05 # fio config.fio
fio: this platform does not support process shared mutexes, 
forcing use of threads. Use the 'thread' option to get rid of this 
warning.

file1: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=psync, iodepth=16
...
file1: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=psync, iodepth=16
fio 2.0.3
Starting 15 threads and 1 process
fio: job startup hung? exiting.
fio: 5 jobs failed to start
Segmentation fault (core dumped)
pu05#


The reason 5 jobs failed to start is because the parent timed out 
on them immediately.

It didn't time out on 10 of them apparently.


if I set the timer to ACPI-fast it works as expected..

Maybe the following code can check whether TSC-low works by letting the
thread run on each CPU.

gettimeofday(&prev, NULL);
int cpu = 0;
for (;;) {
    cpuset_t set;
    cpu = (cpu + 1) % 4;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    gettimeofday(&cur, NULL);
    if (timercmp(&prev, &cur, >=)) {
        abort();
    }
}




pu05# sysctl kern.timecounter.hardware=TSC-low
kern.timecounter.hardware: ACPI-fast -> TSC-low
pu05# ./test
^C
pu05# cat test.c

#include <stdlib.h>
#include <sys/param.h>
#include <sys/cpuset.h>
#include <pthread_np.h>

#include <sys/time.h>

int
main(void)
{
    int cpu = 0;
    struct timeval prev, cur;

    gettimeofday(&prev, NULL);
    for (;;) {
        cpuset_t set;
        cpu = (cpu + 1) % 4;
        CPU_ZERO(&set);
        CPU_SET(cpu, &set);
        pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
        gettimeofday(&cur, NULL);
        /* abort if time went backwards after moving to another CPU */
        if (timercmp(&prev, &cur, >)) {
            abort();
        }
        prev = cur;
    }
}

pu05# ./test

minutes pass...

^C
pu05#

so it looks as if the TSC is working ok..
I'm just going to check that the program is actually moving CPU...
yes it is moving around but I can't tell at what speed. (according to 
top).


so we are still left with a question of where is the problem?

kernel TSC driver?
generic gettimeofday() code?
pthreads cond code?
the application?
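
A variation that may help narrow this down is to record the samples and
report where time first steps backwards, instead of just aborting. The
sketch below is only the checking part, with hypothetical helper names
(tv_before(), first_backward_step()); how the samples are collected (for
example one per CPU hop, as in test.c above) is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Nonzero if a is strictly earlier than b. */
static int
tv_before(const struct timeval *a, const struct timeval *b)
{
	if (a->tv_sec != b->tv_sec)
		return (a->tv_sec < b->tv_sec);
	return (a->tv_usec < b->tv_usec);
}

/*
 * Index of the first sample that is earlier than its predecessor,
 * or -1 if the whole series is non-decreasing.
 */
static long
first_backward_step(const struct timeval *s, size_t n)
{
	size_t i;

	for (i = 1; i < n; i++)
		if (tv_before(&s[i], &s[i - 1]))
			return ((long)i);
	return (-1);
}
```

Equal consecutive samples are not treated as an error here, since two
back-to-back gettimeofday() calls can legitimately return the same value.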






Re: pthread_cond_timedwait() broken in 9-stable? (from JAN 10)

2012-02-17 Thread Julian Elischer


On 2/17/12 3:28 AM, David Xu wrote:

On 2012/2/17 16:06, Julian Elischer wrote:

On 2/16/12 11:41 PM, Julian Elischer wrote:

adding jkim as he seems to be the last person working with TSC.


On 2/16/12 6:42 PM, David Xu wrote:

On 2012/2/17 10:19, Julian Elischer wrote:

On 2/16/12 5:56 PM, David Xu wrote:

On 2012/2/17 8:42, Julian Elischer wrote:
Adding David Xu for his thoughts since he rewrote the code in
question in revision 213098


On 2/16/12 2:57 PM, Julian Elischer wrote:

On 2/16/12 1:06 PM, Julian Elischer wrote:

On 2/16/12 9:34 AM, Andriy Gapon wrote:

on 15/02/2012 23:41 Julian Elischer said the following:

The program fio (an IO test in ports) uses pthreads

the following code (from fio-2.0.3, but it's in earlier code too)
has suddenly started misbehaving.

    clock_gettime(CLOCK_REALTIME, &t);
    t.tv_sec += seconds + 10;

    pthread_mutex_lock(&mutex->lock);

    while (!mutex->value && !ret) {
        mutex->waiters++;
        ret = pthread_cond_timedwait(&mutex->cond, &mutex->lock, &t);
        mutex->waiters--;
    }

    if (!ret) {
        mutex->value--;
        pthread_mutex_unlock(&mutex->lock);
    }


It turns out that 'ret' sometimes comes back instantly (on my machine)
with a value of 60 (ETIMEDOUT)
despite the fact that we set the timeout 10 seconds into the future.

Has anyone else seen anything like this?
(and yes the condition variable attribute has been set to use the
REALTIME clock).

But why?

Just a hypothesis that maybe there is some issue with time 
keeping on that system.

How would that code work out for you with MONOTONIC?


Jens Axboe, (CC'd) tried both CLOCK_REALTIME and 
CLOCK_MONOTONIC, and they both had the same problem..

i.e. random early returns with ETIMEDOUT.

I think we will try moving our machine forward to a newer
-stable to see if it resolves.
Kan upgraded the machine today to today's 9.x branch tip and 
the problem still occurs.

8.x does not have this problem.

I have not got a 9-RELEASE machine to test on.. so I can not 
tell if this came in with the burst of stuff
that came in after the 9.x branch was unfrozen after the 
release of 9.0.





I am trying to reproduce the problem,  do you have complete 
sample code to test ?


I'm still looking for the exact set
but on my machine (4 cpus) the program from ports sysutils/fio 
exhibits the problem when used with
kern.timecounter.hardware=TSC-low and with the following config 
file:


pu05 # cat config.fio

[global]
#clocksource=cpu
direct=1
rw=randread
bs=4096
fill_device=1
numjobs=16
iodepth=16
#ioengine=posixaio
#ioengine=psync
ioengine=psync
group_reporting
norandommap
time_based
runtime=6
randrepeat=0

[file1]
filename=/dev/ada0

pu05 #
pu05 # fio config.fio
fio: this platform does not support process shared mutexes, 
forcing use of threads. Use the 'thread' option to get rid of 
this warning.
file1: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=psync, 
iodepth=16

...
file1: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=psync, 
iodepth=16

fio 2.0.3
Starting 15 threads and 1 process
fio: job startup hung? exiting.
fio: 5 jobs failed to start
Segmentation fault (core dumped)
pu05#


The reason 5 jobs failed to start is because the parent timed 
out on them immediately.

It didn't time out on 10 of them apparently.


if I set the timer to ACPI-fast it works as expected..

Maybe the following code can check whether TSC-low works by letting the
thread run on each CPU.

gettimeofday(&prev, NULL);
int cpu = 0;
for (;;) {
    cpuset_t set;
    cpu = (cpu + 1) % 4;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    gettimeofday(&cur, NULL);
    if (timercmp(&prev, &cur, >=)) {
        abort();
    }
}




pu05# sysctl kern.timecounter.hardware=TSC-low
kern.timecounter.hardware: ACPI-fast -> TSC-low
pu05# ./test
^C
pu05# cat test.c

#include <stdlib.h>
#include <sys/param.h>
#include <sys/cpuset.h>
#include <pthread_np.h>

#include <sys/time.h>

int
main(void)
{
    int cpu = 0;
    struct timeval prev, cur;

    gettimeofday(&prev, NULL);
    for (;;) {
        cpuset_t set;
        cpu = (cpu + 1) % 4;
        CPU_ZERO(&set);
        CPU_SET(cpu, &set);
        pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
        gettimeofday(&cur, NULL);
        /* abort if time went backwards after moving to another CPU */
        if (timercmp(&prev, &cur, >)) {
            abort();
        }
        prev = cur;
    }
}

pu05# ./test

minutes pass...

^C
pu05#

so it looks as if the TSC is working ok..
I'm just going to check that the program is actually moving CPU...
yes it is moving around but I can't tell at what speed. (according 
to top).


so we are still left with a question of where is the problem?

kernel TSC driver?
generic gettimeofday() code?
pthreads cond code?
the application?



I am running the fio test on my notebook which is using TSC-low,
it is on 9.0-RC3, I can not reproduce the problem for
minutes, then I interrupt it with ctrl-c: looks mot

http

Re: pthread_cond_timedwait() broken in 9-stable? (from JAN 10)

2012-02-17 Thread Julian Elischer


On Friday 17 February 2012 06:28 am, David Xu wrote:

On 2012/2/17 16:06, Julian Elischer wrote:

On 2/16/12 11:41 PM, Julian Elischer wrote:

adding jkim as he seems to be the last person working with TSC.

On 2/16/12 6:42 PM, David Xu wrote:

On 2012/2/17 10:19, Julian Elischer wrote:

On 2/16/12 5:56 PM, David Xu wrote:

On 2012/2/17 8:42, Julian Elischer wrote:

Adding David Xu for his thoughts since he rewrote the code
in question in revision 213098

On 2/16/12 2:57 PM, Julian Elischer wrote:

On 2/16/12 1:06 PM, Julian Elischer wrote:

On 2/16/12 9:34 AM, Andriy Gapon wrote:

on 15/02/2012 23:41 Julian Elischer said the following:

The program fio (an IO test in ports) uses pthreads

the following code (from fio-2.0.3, but it's in earlier
code too) has suddenly started misbehaving.

    clock_gettime(CLOCK_REALTIME, &t);
    t.tv_sec += seconds + 10;

    pthread_mutex_lock(&mutex->lock);

    while (!mutex->value && !ret) {
        mutex->waiters++;
        ret = pthread_cond_timedwait(&mutex->cond, &mutex->lock, &t);
        mutex->waiters--;
    }

    if (!ret) {
        mutex->value--;
        pthread_mutex_unlock(&mutex->lock);
    }


It turns out that 'ret' sometimes comes back instantly
(on my machine) with a
value of 60 (ETIMEDOUT)
despite the fact that we set the timeout 10 seconds into
the future.

Has anyone else seen anything like this?
(and yes the condition variable attribute have been set
to use the REALTIME clock).

But why?

Just a hypothesis that maybe there is some issue with
time keeping on that system.
How would that code work out for you with MONOTONIC?

Jens Axboe, (CC'd) tried both CLOCK_REALTIME and
CLOCK_MONOTONIC, and they both had the same problem..
i.e. random early returns with ETIMEDOUT.

I think we will try moving our machine forward to a newer
-stable to see if it resolves.

Kan upgraded the machine today to today's 9.x branch tip
and the problem still occurs.
8.x does not have this problem.

I have not got a 9-RELEASE machine to test on.. so I can
not tell if this came in with the burst of stuff
that came in after the 9.x branch was unfrozen after the
release of 9.0.

I am trying to reproduce the problem,  do you have complete
sample code to test ?

I'm still looking for the exact set
but on my machine (4 cpus) the program from ports sysutils/fio
exhibits the problem when used with
kern.timecounter.hardware=TSC-low and with the following
config file:

pu05 # cat config.fio

[global]
#clocksource=cpu
direct=1
rw=randread
bs=4096
fill_device=1
numjobs=16
iodepth=16
#ioengine=posixaio
#ioengine=psync
ioengine=psync
group_reporting
norandommap
time_based
runtime=6
randrepeat=0

[file1]
filename=/dev/ada0

pu05 #
pu05 # fio config.fio
fio: this platform does not support process shared mutexes,
forcing use of threads. Use the 'thread' option to get rid of
this warning. file1: (g=0): rw=randread, bs=4K-4K/4K-4K,
ioengine=psync, iodepth=16 ...
file1: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=psync,
iodepth=16 fio 2.0.3
Starting 15 threads and 1 process
fio: job startup hung? exiting.
fio: 5 jobs failed to start
Segmentation fault (core dumped)
pu05#


The reason 5 jobs failed to start is because the parent timed
out on them immediately.
It didn't time out on 10 of them apparently.


if I set the timer to ACPI-fast it works as expected..

Maybe the following code can check whether TSC-low works by letting
the thread run on each CPU.

gettimeofday(&prev, NULL);
int cpu = 0;
for (;;) {
    cpuset_t set;
    cpu = (cpu + 1) % 4;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    gettimeofday(&cur, NULL);
    if (timercmp(&prev, &cur, >=)) {
        abort();
    }
}

pu05# sysctl kern.timecounter.hardware=TSC-low
kern.timecounter.hardware: ACPI-fast -> TSC-low
pu05# ./test
^C
pu05# cat test.c

#include <stdlib.h>
#include <sys/param.h>
#include <sys/cpuset.h>
#include <pthread_np.h>

#include <sys/time.h>

int
main(void)
{
    int cpu = 0;
    struct timeval prev, cur;

    gettimeofday(&prev, NULL);
    for (;;) {
        cpuset_t set;
        cpu = (cpu + 1) % 4;
        CPU_ZERO(&set);
        CPU_SET(cpu, &set);
        pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
        gettimeofday(&cur, NULL);
        /* abort if time went backwards after moving to another CPU */
        if (timercmp(&prev, &cur, >)) {
            abort();
        }
        prev = cur;
    }
}

pu05# ./test

minutes pass...

^C
pu05#

so it looks as if the TSC is working ok..
I'm just going to check that the program is actually moving
CPU... yes it is moving around but I can't tell at what speed.
(according to top).

so we are still left with a question of where is the problem?

kernel TSC driver?
generic gettimeofday() code?
pthreads cond code?
the application?

I am running the fio test on my notebook which is using TSC-low,
it is on 9.0-RC3, I can not reproduce the problem for
minutes, then I interrupt it with ctrl-c:

http://people.freebsd.org

something wrong with dmesg?

2012-02-16 Thread Julian Elischer

I just noticed that lately in 9.x and maybe 8-Stable, dmesg seems to
return nothing if there is active logging going on. I saw someone else
refer to this as well.


Has this been reported?




Re: pthread_cond_timedwait() broken in 9-stable? (from JAN 10)

2012-02-16 Thread Julian Elischer


On 2/16/12 9:34 AM, Andriy Gapon wrote:

on 15/02/2012 23:41 Julian Elischer said the following:

The program fio (an IO test in ports) uses pthreads

the following code (from fio-2.0.3, but its in earlier code too)
has suddenly started misbehaving.

 clock_gettime(CLOCK_REALTIME, &t);
 t.tv_sec += seconds + 10;

 pthread_mutex_lock(&mutex->lock);

 while (!mutex->value && !ret) {
     mutex->waiters++;
     ret = pthread_cond_timedwait(&mutex->cond, &mutex->lock, &t);
     mutex->waiters--;
 }

 if (!ret) {
     mutex->value--;
     pthread_mutex_unlock(&mutex->lock);
 }


It turns out that 'ret' sometimes comes back instantly (on my machine) with a
value of 60 (ETIMEDOUT)
despite the fact that we set the timeout 10 seconds into the future.

Has anyone else seen anything like this?
(and yes the condition variable attribute have been set to use the REALTIME 
clock).

But why?

Just a hypothesis that maybe there is some issue with time keeping on that 
system.
How would that code work out for you with MONOTONIC?


Jens Axboe, (CC'd) tried both CLOCK_REALTIME and CLOCK_MONOTONIC, and 
they both had the same problem..

i.e. random early returns with ETIMEDOUT.

I think we will try to move our machine forward to a newer -stable to see 
if it resolves.




Re: something wrong with dmesg?

2012-02-16 Thread Julian Elischer


On 2/16/12 1:27 AM, Sergey Kandaurov wrote:

On 16 February 2012 12:58, Julian Elischer <jul...@freebsd.org> wrote:

I just noticed that lately in 9.x and maybe 8-Stable, dmesg seems to return
nothing if
there is active logging going on. I saw someone else refer to this as well.

Has this been reported?

Didn't we have this for years? I cannot recall there was a difference to the
current behavior since 8.x or 7.x (or even 6.x).
kern.msgbuf includes messages sent with syslog, and if they dominate
then you might get empty dmesg output. By default it skips non-kernel
(or rather those messages which don't use BSD syslog format)
messages, and you can include them using dmesg -a.


I expect to see the output from all kernel 'printf' statements in dmesg.
They were definitely not turning up for me last week, but if I tried 
again, there would be contents.


Re: pthread_cond_timedwait() broken in 9-stable? (from JAN 10)

2012-02-16 Thread Julian Elischer


On 2/16/12 1:06 PM, Julian Elischer wrote:

On 2/16/12 9:34 AM, Andriy Gapon wrote:

on 15/02/2012 23:41 Julian Elischer said the following:

The program fio (an IO test in ports) uses pthreads

the following code (from fio-2.0.3, but its in earlier code too)
has suddenly started misbehaving.

 clock_gettime(CLOCK_REALTIME, &t);
 t.tv_sec += seconds + 10;

 pthread_mutex_lock(&mutex->lock);

 while (!mutex->value && !ret) {
     mutex->waiters++;
     ret = pthread_cond_timedwait(&mutex->cond, &mutex->lock, &t);
     mutex->waiters--;
 }

 if (!ret) {
     mutex->value--;
     pthread_mutex_unlock(&mutex->lock);
 }


It turns out that 'ret' sometimes comes back instantly (on my 
machine) with a

value of 60 (ETIMEDOUT)
despite the fact that we set the timeout 10 seconds into the future.

Has anyone else seen anything like this?
(and yes the condition variable attribute have been set to use the 
REALTIME clock).

But why?

Just a hypothesis that maybe there is some issue with time keeping 
on that system.

How would that code work out for you with MONOTONIC?


Jens Axboe, (CC'd) tried both CLOCK_REALTIME and CLOCK_MONOTONIC, 
and they both had the same problem..

i.e. random early returns with ETIMEDOUT.

I think we will try to move our machine forward to a newer -stable to 
see if it resolves.
Kan upgraded the machine today to today's 9.x branch tip and the 
problem still occurs.

8.x does not have this problem.

I have not got a 9-RELEASE machine to test on.. so I can not tell if 
this came in with the burst of stuff

that came in after the 9.x branch was unfrozen after the release of 9.0.




___
freebsd-thre...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-threads
To unsubscribe, send any mail to 
freebsd-threads-unsubscr...@freebsd.org





Re: pthread_cond_timedwait() broken in 9-stable? (from JAN 10)

2012-02-16 Thread Julian Elischer

Adding David Xu for his thoughts since he rewrote the code in question 
in revision 213098


On 2/16/12 2:57 PM, Julian Elischer wrote:

On 2/16/12 1:06 PM, Julian Elischer wrote:

On 2/16/12 9:34 AM, Andriy Gapon wrote:

on 15/02/2012 23:41 Julian Elischer said the following:

The program fio (an IO test in ports) uses pthreads

the following code (from fio-2.0.3, but its in earlier code too)
has suddenly started misbehaving.

 clock_gettime(CLOCK_REALTIME, &t);
 t.tv_sec += seconds + 10;

 pthread_mutex_lock(&mutex->lock);

 while (!mutex->value && !ret) {
     mutex->waiters++;
     ret = pthread_cond_timedwait(&mutex->cond, &mutex->lock, &t);
     mutex->waiters--;
 }

 if (!ret) {
     mutex->value--;
     pthread_mutex_unlock(&mutex->lock);
 }


It turns out that 'ret' sometimes comes back instantly (on my 
machine) with a

value of 60 (ETIMEDOUT)
despite the fact that we set the timeout 10 seconds into the future.

Has anyone else seen anything like this?
(and yes the condition variable attribute have been set to use 
the REALTIME clock).

But why?

Just a hypothesis that maybe there is some issue with time keeping 
on that system.

How would that code work out for you with MONOTONIC?


Jens Axboe, (CC'd) tried both CLOCK_REALTIME and CLOCK_MONOTONIC, 
and they both had the same problem..

i.e. random early returns with ETIMEDOUT.

I think we will try to move our machine forward to a newer -stable to 
see if it resolves.
Kan upgraded the machine today to today's 9.x branch tip and the 
problem still occurs.

8.x does not have this problem.

I have not got a 9-RELEASE machine to test on.. so I can not tell if 
this came in with the burst of stuff
that came in after the 9.x branch was unfrozen after the release of 
9.0.






Re: pthread_cond_timedwait() broken in 9-stable? [possible answer]

2012-02-16 Thread Julian Elischer



kern.timecounter.tick: 1
kern.timecounter.choice: TSC-low(1000) i8254(0) HPET(950) 
ACPI-fast(900) dummy(-100)

kern.timecounter.hardware: ACPI-fast
kern.timecounter.stepwarnings: 0

switching the machine from TSC-low to ACPI-fast fixes the problem.

in 8.x it used to default to ACPI
but I used to switch it to TSC to get better performance.

I wonder why TSC-low is now bad to use..
maybe the TSCs are not as well synchronised as they were in 8.x?
maybe the pthreads code didn't get the memo about changing timers?

Re: pthread_cond_timedwait() broken in 9-stable? (from JAN 10)

2012-02-16 Thread Julian Elischer


On 2/16/12 5:56 PM, David Xu wrote:

On 2012/2/17 8:42, Julian Elischer wrote:
Adding David Xu for his thoughts since he rewrote the code in 
question in revision 213098


On 2/16/12 2:57 PM, Julian Elischer wrote:

On 2/16/12 1:06 PM, Julian Elischer wrote:

On 2/16/12 9:34 AM, Andriy Gapon wrote:

on 15/02/2012 23:41 Julian Elischer said the following:

The program fio (an IO test in ports) uses pthreads

the following code (from fio-2.0.3, but its in earlier code too)
has suddenly started misbehaving.

 clock_gettime(CLOCK_REALTIME, &t);
 t.tv_sec += seconds + 10;

 pthread_mutex_lock(&mutex->lock);

 while (!mutex->value && !ret) {
     mutex->waiters++;
     ret = pthread_cond_timedwait(&mutex->cond, &mutex->lock, &t);
     mutex->waiters--;
 }

 if (!ret) {
     mutex->value--;
     pthread_mutex_unlock(&mutex->lock);
 }


It turns out that 'ret' sometimes comes back instantly (on my 
machine) with a

value of 60 (ETIMEDOUT)
despite the fact that we set the timeout 10 seconds into the 
future.


Has anyone else seen anything like this?
(and yes the condition variable attribute have been set to use 
the REALTIME clock).

But why?

Just a hypothesis that maybe there is some issue with time 
keeping on that system.

How would that code work out for you with MONOTONIC?


Jens Axboe, (CC'd) tried both CLOCK_REALTIME and CLOCK_MONOTONIC, 
and they both had the same problem..

i.e. random early returns with ETIMEDOUT.

I think we will try to move our machine forward to a newer -stable 
to see if it resolves.
Kan upgraded the machine today to today's 9.x branch tip and the 
problem still occurs.

8.x does not have this problem.

I have not got a 9-RELEASE machine to test on.. so I can not tell 
if this came in with the burst of stuff
that came in after the 9.x branch was unfrozen after the release 
of 9.0.





I am trying to reproduce the problem,  do you have complete sample 
code to test ?


I'm still looking for the exact set
but on my machine (4 cpus) the program from ports sysutils/fio 
exhibits the problem when used with

kern.timecounter.hardware=TSC-low and with the following config file:

pu05 # cat config.fio

[global]
#clocksource=cpu
direct=1
rw=randread
bs=4096
fill_device=1
numjobs=16
iodepth=16
#ioengine=posixaio
#ioengine=psync
ioengine=psync
group_reporting
norandommap
time_based
runtime=6
randrepeat=0

[file1]
filename=/dev/ada0

pu05 #
pu05 # fio config.fio
fio: this platform does not support process shared mutexes, forcing 
use of threads. Use the 'thread' option to get rid of this warning.

file1: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=psync, iodepth=16
...
file1: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=psync, iodepth=16
fio 2.0.3
Starting 15 threads and 1 process
fio: job startup hung? exiting.
fio: 5 jobs failed to start
Segmentation fault (core dumped)
pu05#


The reason 5 jobs failed to start is because the parent timed out on 
them immediately.

It didn't time out on 10 of them apparently.


if I set the timer to ACPI-fast it works as expected..


Regards,
David Xu





Re: pthread_cond_timedwait() broken in 9-stable? (from JAN 10)

2012-02-16 Thread Julian Elischer


adding jkim as he seems to be the last person working with TSC.


On 2/16/12 6:42 PM, David Xu wrote:

On 2012/2/17 10:19, Julian Elischer wrote:

On 2/16/12 5:56 PM, David Xu wrote:

On 2012/2/17 8:42, Julian Elischer wrote:
Adding David Xu for his thoughts since he rewrote the code in 
question in revision 213098


On 2/16/12 2:57 PM, Julian Elischer wrote:

On 2/16/12 1:06 PM, Julian Elischer wrote:

On 2/16/12 9:34 AM, Andriy Gapon wrote:

on 15/02/2012 23:41 Julian Elischer said the following:

The program fio (an IO test in ports) uses pthreads

the following code (from fio-2.0.3, but its in earlier code too)
has suddenly started misbehaving.

 clock_gettime(CLOCK_REALTIME, &t);
 t.tv_sec += seconds + 10;

 pthread_mutex_lock(&mutex->lock);

 while (!mutex->value && !ret) {
     mutex->waiters++;
     ret = pthread_cond_timedwait(&mutex->cond, &mutex->lock, &t);
     mutex->waiters--;
 }

 if (!ret) {
     mutex->value--;
     pthread_mutex_unlock(&mutex->lock);
 }


It turns out that 'ret' sometimes comes back instantly (on my 
machine) with a

value of 60 (ETIMEDOUT)
despite the fact that we set the timeout 10 seconds into the 
future.


Has anyone else seen anything like this?
(and yes the condition variable attribute have been set to 
use the REALTIME clock).

But why?

Just a hypothesis that maybe there is some issue with time 
keeping on that system.

How would that code work out for you with MONOTONIC?


Jens Axboe, (CC'd) tried both CLOCK_REALTIME and 
CLOCK_MONOTONIC, and they both had the same problem..

i.e. random early returns with ETIMEDOUT.

I think we will try to move our machine forward to a newer -stable 
to see if it resolves.
Kan upgraded the machine today to today's 9.x branch tip and the 
problem still occurs.

8.x does not have this problem.

I have not got a 9-RELEASE machine to test on.. so I can not 
tell if this came in with the burst of stuff
that came in after the 9.x branch was unfrozen after the release 
of 9.0.





I am trying to reproduce the problem,  do you have complete sample 
code to test ?


I'm still looking for the exact set
but on my machine (4 cpus) the program from ports sysutils/fio 
exhibits the problem when used with

kern.timecounter.hardware=TSC-low and with the following config file:

pu05 # cat config.fio

[global]
#clocksource=cpu
direct=1
rw=randread
bs=4096
fill_device=1
numjobs=16
iodepth=16
#ioengine=posixaio
#ioengine=psync
ioengine=psync
group_reporting
norandommap
time_based
runtime=6
randrepeat=0

[file1]
filename=/dev/ada0

pu05 #
pu05 # fio config.fio
fio: this platform does not support process shared mutexes, forcing 
use of threads. Use the 'thread' option to get rid of this warning.

file1: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=psync, iodepth=16
...
file1: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=psync, iodepth=16
fio 2.0.3
Starting 15 threads and 1 process
fio: job startup hung? exiting.
fio: 5 jobs failed to start
Segmentation fault (core dumped)
pu05#


The reason 5 jobs failed to start is because the parent timed out 
on them immediately.

It didn't time out on 10 of them apparently.


if I set the timer to ACPI-fast it works as expected..
maybe the following code can check whether TSC-LOW works, by letting the 
thread run on each cpu.

gettimeofday(&prev, NULL);
int cpu = 0;
for (;;) {
 cpuset_t set;
 cpu = (cpu + 1) % 4;
 CPU_ZERO(&set);
 CPU_SET(cpu, &set);
 pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
 gettimeofday(&cur, NULL);
 if (timercmp(&prev, &cur, >=)) {
    abort();
 }
}





pthread_cond_timedwait() broken in 9-stable? (from JAN 10)

2012-02-15 Thread Julian Elischer


The program fio (an IO test in ports) uses pthreads

the following code (from fio-2.0.3, but its in earlier code too)
has suddenly started misbehaving.

clock_gettime(CLOCK_REALTIME, &t);
t.tv_sec += seconds + 10;

pthread_mutex_lock(&mutex->lock);

while (!mutex->value && !ret) {
    mutex->waiters++;
    ret = pthread_cond_timedwait(&mutex->cond, &mutex->lock, &t);
    mutex->waiters--;
}

if (!ret) {
    mutex->value--;
    pthread_mutex_unlock(&mutex->lock);
}


It turns out that 'ret' sometimes comes back instantly (on my machine) 
with a value of 60 (ETIMEDOUT)

despite the fact that we set the timeout 10 seconds into the future.

Has anyone else seen anything like this?
(and yes the condition variable attribute have been set to use the 
REALTIME clock).




freebsd 9-stable TOP problem from around Jan 10

2012-02-14 Thread Julian Elischer


Has anyone else seen a  problem with top -H -S?

after a short while the screen gets more and more corrupted..

hitting ^L or turning off S & H modes helps .. for a while.

If this is a known fixed problem, let me know but I need to 
co-ordinate with others

to upgrade the machine in question.

Julian


Re: Custom kernel poll summary

2012-02-14 Thread Julian Elischer


On 2/14/12 7:43 AM, Ian Smith wrote:

On Tue, 14 Feb 2012 2:37:55 +0100, Alexander Leidinger wrote:
Here is what I got, the first column is the number of requests, the second
what is requested, and the 3rd my comments (basically it means, if there 
is a
comment, it is not needed/possible to include in a modular kernel):
---snip---

[..]

1 IPFIREWALL_FORWARD ->  performance impact too big if unused (julian)


well, it's not that big, but you will be running extra code for every 
packet even if you don't use it.
when I made it an option I was mainly trying to placate the "just 
say no" crowd.
I personally wouldn't mind having it on by default in GENERIC, as 
long as we still make it an option
so people who want every last drop of cpu can remove it.

I expect Julian will object if I've mis-paraphrased or over-simplified
something I recall him saying at least a couple of years ago :)

[..]

4 ALTQ*  ->  does add code to the pf module
   other impact?

ipfw(8) can also apply ALTQ tags, but relies on pfctl(8) to setup the
queues - or so I read; I've not used it here.  From altq(4):

  ALTQ        Enable ALTQ.
  ALTQ_CBQ    Build the ``Class Based Queuing'' discipline.
  ALTQ_RED    Build the ``Random Early Detection'' extension.
  ALTQ_RIO    Build ``Random Early Drop'' for input and output.
  ALTQ_HFSC   Build the ``Hierarchical Packet Scheduler'' discipline.
  ALTQ_CDNR   Build the traffic conditioner.  This option is meaningless at
              the moment as the conditioner is not used by any of the
              available disciplines or consumers.
  ALTQ_PRIQ   Build the ``Priority Queuing'' discipline.
  ALTQ_NOPCC  Required if the TSC is unusable.
  ALTQ_DEBUG  Enable additional debugging facilities.

  Note that ALTQ-disciplines cannot be loaded as kernel modules.  In order
  to use a certain discipline you have to build it into a custom kernel.
  The pf(4) interface, that is required for the configuration process of
  ALTQ can be loaded as a module.

So which disciplines would one choose?  Seeming an unlikely candidate?

1 IPSTEALTH  ->  changes ipfw module only?

I don't think this is specific to ipfw.  From /sys/conf/NOTES:

# IPSTEALTH enables code to support stealth forwarding (i.e., forwarding
# packets without touching the TTL).  This can be useful to hide firewalls
# from traceroute and similar tools.

But can it be disabled once added to kernel?  It's no good as a default.

1 IPFIREWALL_VERBOSE_LIMIT=5 ->  changes ipfw module only?
   loader tunable?
1 IPFIREWALL_VERBOSE ->  changes ipfw module only?
   loader tunable?

sysctl.conf: net.inet.ip.fw.verbose and net.inet.ip.fw.verbose_limit

cheers, Ian




Re: freebsd 9-stable TOP problem from around Jan 10

2012-02-14 Thread Julian Elischer


On 2/14/12 10:38 AM, Kevin Oberman wrote:

On Tue, Feb 14, 2012 at 12:23 AM, Julian Elischer <jul...@freebsd.org> wrote:

Has anyone else seen a  problem with top -H -S?

after a short while the screen gets more and more corrupted..

hitting ^L or turning off S & H modes helps .. for a while.

If this is a known fixed problem, let me know but I need to co-ordinate with
others
to upgrade the machine in question.

Not seeing it here on 9-stable. Could it be a display issue? I am
using gnome-terminal with TERM defined as 'xterm'.


yeah I'm on a mac with iterm, but running through 'screen' .

it's never been a problem before.. just since we upgraded to 9-stable.




Re: freebsd 9-stable TOP problem from around Jan 10

2012-02-14 Thread Julian Elischer


On 2/14/12 4:20 PM, Jeremy Chadwick wrote:

On Tue, Feb 14, 2012 at 03:35:01PM -0800, Julian Elischer wrote:

On 2/14/12 10:38 AM, Kevin Oberman wrote:

On Tue, Feb 14, 2012 at 12:23 AM, Julian Elischer <jul...@freebsd.org> wrote:

Has anyone else seen a  problem with top -H -S?

after a short while the screen gets more and more corrupted..

hitting ^L or turning off S & H modes helps .. for a while.

If this is a known fixed problem, let me know but I need to co-ordinate with
others
to upgrade the machine in question.

Not seeing it here on 9-stable. Could it be a display issue? I am
using gnome-terminal with TERM defined as 'xterm'.

yeah I'm on a mac with iterm, but running through 'screen' .

it's never been a problem before.. just since we upgraded to 9-stable.

If you remove GNU screen from the picture does the problem go away?  If
so, I'm not surprised.  :-)

Make sure that when you're using GNU screen, that all shells launched
under/within screen have TERM=screen.  If they don't, then this is
almost certainly the problem -- GNU screen translates between terminal
types, meaning it translates its own terminal type (screen) into
whatever TERM is currently attached (xterm, iterm, whatever).  See
the last 4 paragraphs of my post here to understand what exactly GNU
screen is doing:

http://lists.freebsd.org/pipermail/freebsd-stable/2011-June/063052.html

So, in general, make sure your dotfiles and so on don't mess about with
the $TERM environment variable and you should generally be okay.
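
The check described above could be sketched roughly as follows. This is an illustration only: term_consistent_with_screen() is a made-up name, it relies on GNU screen exporting STY to processes running inside a session, and it treats any TERM value beginning with "screen" (for example "screen-256color") as acceptable, which is an assumption rather than a rule.

```c
#include <string.h>

/*
 * Hypothetical check: `sty` is the value of $STY (set by GNU screen
 * for processes inside a session, NULL otherwise) and `term` is the
 * value of $TERM. Returns 1 when the pair looks consistent, 0 when a
 * shell inside screen has a TERM that does not start with "screen".
 */
static int term_consistent_with_screen(const char *sty, const char *term)
{
	if (sty == NULL || sty[0] == '\0')
		return 1;	/* not inside screen: nothing to check */
	if (term == NULL)
		return 0;
	/* Accept "screen" and variants such as "screen-256color". */
	return strncmp(term, "screen", 6) == 0;
}
```

It would be called as term_consistent_with_screen(getenv("STY"), getenv("TERM")) from a shell started under screen. A result of 0 would suggest that a dotfile has overridden TERM.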

it seems to have stopped doing it for no apparent reason

will keep an eye on it. and save this email away for when it does it 
again.



If within GNU screen TERM=screen and you see the problem, but outside of
screen you use TERM=xterm (or something else) but don't see the problem,
then I would almost certainly blame GNU screen.  If you're looking for
something that simply keeps a terminal running in the background, try
nohup or tmux.

Alternately, possibly someone added a screen entry to /etc/termcap on
RELENG_9?  I don't use 9 so I have no way to confirm this, but on 8
there is no such entry.


SC|screen|VT 100/ANSI X3.64 virtual terminal:\


Re: known problems with 8.x and HP DL16 G5 server?

2012-02-11 Thread Julian Elischer


On 2/9/12 10:24 PM, Jeremy Chadwick wrote:

On Thu, Feb 09, 2012 at 04:02:12PM -0800, Julian Elischer wrote:

On 2/9/12 1:56 PM, Jeremy Chadwick wrote:

On Thu, Feb 09, 2012 at 01:48:29PM -0800, Julian Elischer wrote:

does anyone know of problems with freebsd and this system?

the kernel we tried to boot seems to stop somewhere in the ahci probing.

Few things:

1) Possible to get full console output (e.g. serial, etc.) from a verbose
boot?

it's freebsd 8.2 from a TrueNAS/FreeNAS. I'm actually at ix-systems
at the
moment.. but I was hoping someone could save us some time by saying
"Oh yeah, merge in change number xx"


2) Can you also provide the exact release/tag/kernel/thing you're trying
to install or upgrade to (8.x is a little vague; there are all sorts
of changes that happen between tags).  For example 8.1 is not going to
behave the same necessarily as 8.2.

3) When you say "ahci probing", are you booting a standard installation
CD/DVD/memstick of, say, 8.2?  If so, those won't make use of the
AHCI-to-CAM translation layer (and that AHCI code is also different than
the native-ATA-AHCI code), so you might try, when booting the system,
dropping to the loader prompt and issuing "load ahci.ko" before typing
"boot".  See if that helps.  If it does, great, use it (ahci_load="yes"
in /boot/loader.conf) permanently (and benefit from things like NCQ
too).

let me forward you an image...

4) If it's an Intel ESB2 controller, I believe there were some fixes or
identification shims put in place for this in recent RELENG_8, which
wouldn't be available in RELENG_8_2 or 8.2-RELEASE CD/DVDs.  I could be
remembering the wrong controller though.  Hmm...


that may be what we are looking for.

I'll try get more info.

For others: the last few lines in the kernel log are:

acpi_hpet0: <High Precision Event Timer> iomem 0xfed0-0xfed003ff on acpi0
acpi_hpet0: vend: 0x8086 rev: 0x1 num: 3 hz: 14318180 opts: legacy_route 64-bit
Timecounter HPET frequency 14318180 Hz quality 900
acpi: wakeup code va 0xff848311d000 pa 0x4000
ahc_isa_probe 0: ioport 0xc00 alloc failed

I don't see any indication of AHCI problems here (or AHCI at all).
ahc_isa_probe is for the ahc(4) controller -- Adaptec SCSI.

A verbose boot might be more helpful.


turns out that the HP machine has an HP branded (and with different 
firmware) raid controller
that is not quite the same as the standard one. FreeBSD can't handle 
it and dies.


Josh Paetzel may remember the exact type.. I forget..




known problems with 8.x and HP DL16 G5 server?

2012-02-09 Thread Julian Elischer


does anyone know of problems with freebsd and this system?

the kernel we tried to boot seems to stop somewhere in the ahci probing.


Re: kernel debugging and ULE

2012-02-08 Thread Julian Elischer


On 2/7/12 1:50 AM, Andriy Gapon wrote:

on 06/02/2012 07:52 Julian Elischer said the following:

so if I'm sitting still in the debugger for too long, a hardclock
event happens that goes into ULE, which then hits the following KASSERT.


KASSERT(pri >= PRI_MIN_BATCH && pri <= PRI_MAX_BATCH,
    ("sched_priority: invalid priority %d: nice %d, "
    "ticks %d ftick %d ltick %d tick pri %d",
    pri, td->td_proc->p_nice, td->td_sched->ts_ticks,
    td->td_sched->ts_ftick, td->td_sched->ts_ltick,
    SCHED_PRI_TICKS(td->td_sched)));


The reason seems to be that I've been sitting still for too long and things have
become pear shaped.


how is it that being in the debugger doesn't stop hardclock events?
is there something I can do to make them not happen..
It means I have to get my debugging done in less than about 60 seconds.

suggestions welcome.

Does this really happen when you just sit in the debugger?
Or does it happen when you let the kernel run?  Like stepping through the code,
etc


good point.. I was doing some single stepping..


Re: problem with kgdb and modules. (k)gdb expert needed.

2012-02-05 Thread Julian Elischer

In 9.x (can't check -current, but the mailing list has a better 
readership)


I'm still seeing this and have still not found any solution:
possible reasons for the change may be:
1/ change to kgdb?
2/ change to the compiling toolset?
3/ change to the .mk files for compiling modules?

any guidance would be appreciated..
The reason I can get away with using FreeBSD at work is because I can 
debug modules well,
as in Linux this is generally a problem.. Now I see similar breakage 
in freebsd.  (sigh).


I really don't know where to start looking for this..

Julian

On 2/3/12 11:55 PM, Julian Elischer wrote:
so we upgraded our development machines from 8 stable to 9 stable, 
and now kgdb can't debug inside modules.


instead of getting anything useful, we just get:

(kgdb) bt
#0  0x81814600 in ?? () from /boot/kernel/netgraph.ko
#1  0x81812d80 in ?? () from /boot/kernel/ng_socket.ko
#2  0x0037 in ?? ()
#3  0x0002 in ?? ()
#4  0xfe0007176aa0 in ?? ()
#5  0xfe0007176aa0 in ?? ()
#6  0x818134a0 in ?? () from /boot/kernel/ng_socket.ko
#7  0x81813960 in ?? () from /boot/kernel/ng_socket.ko
#8  0xff860fa3cad0 in ?? ()
#9  0x808cc76e in socreate (dom=Variable dom is not 
available.

) at ../../../kern/uipc_socket.c:411



but stopping in the kernel itself, we DO see stuff..

(kgdb) break socreate
Breakpoint 1 at 0x808cc628: file 
../../../kern/uipc_socket.c, line 372.

(kgdb) c
Continuing.



[New Thread 100198]
[Switching to Thread 100198]

Breakpoint 1, socreate (dom=32, aso=0xff860fa3caf0, type=2, 
proto=1, cred=0xfe000c63f600, td=0xfe011501a000) at 
../../../kern/uipc_socket.c:372

372 if (proto)
(kgdb) bt
#0  socreate (dom=32, aso=0xff860fa3caf0, type=2, proto=1, 
cred=0xfe000c63f600, td=0xfe011501a000) at 
../../../kern/uipc_socket.c:372
#1  0x808cf710 in sys_socket (td=0xfe011501a000, 
uap=0xff860fa3cbc0) at ../../../kern/uipc_syscalls.c:199
#2  0x80b5599a in amd64_syscall (td=0xfe011501a000, 
traced=0) at subr_syscall.c:131
#3  0x80b40b57 in Xfast_syscall () at 
../../../amd64/amd64/exception.S:387

#4  0x0008011c82ac in ?? ()



etc.

it looks as if modules no longer have stack frames compiled in.
does anyone know the culprit?

___
freebsd-curr...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to 
freebsd-current-unsubscr...@freebsd.org





Re: problem with kgdb and modules. (k)gdb expert needed.

2012-02-05 Thread Julian Elischer


On 2/5/12 3:05 AM, Andriy Gapon wrote:

on 05/02/2012 09:58 Julian Elischer said the following:

In 9.x (can't check -current, but the mailing list has a better readership)

I'm still seeing this and have still not found any solution:
possible reasons for the change may be:
1/ change to kgdb?
2/ change to the compiling toolset?
3/ change to the .mk files for compiling modules?

any guidance would be appreciated..
The reason I can get away with using FreeBSD at work is because I can debug
modules well,
as in Linux this is generally a problem.. Now I see similar breakage in
freebsd.  (sigh).

I really don't know where to start looking for this..

Julian,

just in case, how about some basic stuff like checking that the modules are
indeed built with debugging support, that .symbols are installed and are
accessible, that kgdb produces those messages: "Reading symbols", "Loaded 
symbols".


it seems to have been some timing issue.
the scripts that ran in 8.x fail to load the symbols in 9.x but if I 
do the commands again by hand,

it does load them..

so it seems to be a false alarm


kernel debugging and ULE

2012-02-05 Thread Julian Elischer


so if I'm sitting still in the debugger for too long, a hardclock
event happens that goes into ULE, which then hits the following KASSERT.


KASSERT(pri >= PRI_MIN_BATCH && pri <= PRI_MAX_BATCH,
    ("sched_priority: invalid priority %d: nice %d, "
    "ticks %d ftick %d ltick %d tick pri %d",
    pri, td->td_proc->p_nice, td->td_sched->ts_ticks,
    td->td_sched->ts_ftick, td->td_sched->ts_ltick,
    SCHED_PRI_TICKS(td->td_sched)));


The reason seems to be that I've been sitting still for too long and 
things have become pear shaped.



how is it that being in the debugger doesn't stop hardclock events?
is there something I can do to make them not happen..
It means I have to get my debugging done in less than about 60 seconds.

suggestions welcome.


Julian

Re: Escaping from a jail with root privileges on the host

2011-12-28 Thread Julian Elischer


On 12/28/11 12:58 AM, Marin Atanasov Nikolov wrote:

Hello,

Today I've managed to escape from a jail by accident and ended up with
root access to the host's filesystem.

Here's what I did:

  * Using ezjail for managing my jails
  * Verified in FreeBSD 9.0-BETA3 and 9.0-RC3
  * This works only when I use sudo, and cannot reproduce if I execute
everything as root

First, created a folder *inside* the jail and cd to it:

  host$ sudo ezjail-admin console jail-test

  jail-test# id
  uid=0(root) gid=0(wheel) groups=0(wheel),5(operator)

  jail-test# mkdir ~/jail-folder
  jail-test# cd ~/jail-folder

  jail-test# pwd
  /root/jail-folder

Then from the host machine I've moved this folder to the cwd.

host$ pwd
/usr/home/mra

host$ sudo mv /home/jails/jail-test/root/jail-folder .

And then here's where the jail ends up :)

  jail-test# pwd
  /usr/home/mra/jail-folder

 From here on the Jail's root user has full root privileges to the
host's filesystem.

Not sure if it is sudo or jail issue, and would be nice if someone
with more experience can check this up :)


This is not really escaping.
It's more like being sprung by your friends outside since
it requires outside participation.
The jailed process cannot do it by itself.
Now what would be more interesting is if the jailed process can
make a new jail inside the old jail and then 'spring' the inmate there.
will that inmate be still inside the parent jail, or outside both jails?


Regards,
Marin




Re: TTY task group scheduling

2010-11-18 Thread Julian Elischer


On 11/18/10 10:55 AM, Lucius Windschuh wrote:

2010/11/18 Andriy Gapona...@freebsd.org:

[Grouping of processes into TTY groups]

Well, I think that those improvements apply only to a very specific usage 
pattern
and are greatly over-hyped.

But there are serious issues if you use FreeBSD as a desktop OS with
SMP and SCHED_ULE, aren't there?
Because currently, my machine is barely usable if a compile job with
parallelism is running. Movies stutter, Firefox hangs. And even nice
-n 20 doesn't do the job in every case, as +20 seems not to be the
idle priority anymore?!?
And using idprio 1 $cmd as a workaround is, well, a kludge.
I am not sure if TTY grouping is the right solution, if you look at
potentially CPU-intensive GUI applications that all run on the same
TTY (or no TTY at all? Same problem).
Maybe, we could simply enhance the algorithm that decides if a task is
interactive? That would also improve the described situation.


tty grouping is a variant of what we used to have at one stage, which is
a kernel schedulable entity group.. KSEG

the idea is that all items in a group share some characteristic and
some amount of resources.

We stripped the KSEG out of the picture because it really complicated
the picture.




Regards,

Lucius
___
freebsd-curr...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org




Re: TTY task group scheduling

2010-11-18 Thread Julian Elischer


On 11/18/10 3:37 PM, Alexander Best wrote:

On Fri Nov 19 10, Daniel Nebdal wrote:

On Fri, Nov 19, 2010 at 12:06 AM, Alexander Kabaevkab...@gmail.com  wrote:

On Thu, 18 Nov 2010 18:56:35 +
Alexander Bestarun...@freebsd.org  wrote:


On Thu Nov 18 10, Matthew D. Fuller wrote:

On Thu, Nov 18, 2010 at 06:23:24PM + I heard the voice of
Alexander Best, and lo! it spake thus:

judging from the videos the changes are having a huge impact imo.

Well, my (admittedly limited, and certainly anecdotal) experience is
that Linux's interactive response when under heavy load was always
much worse than FreeBSD's.  So maybe that's just them catching up to
where we already are   ;)

well...i tried playing back a 1080p video file while doing
`make -j64 buildkernel` and FreeBSD's interactivity seems far from
perfect.

One thing that just begs to be asked: since when did decoding 1080p become
an interactive task?


Strictly speaking it isn't - but displaying it is a timing-sensitive
task that isn't CPU- or I/O-bound, and scheduling-wise that probably
makes it more like the "fast response when woken up" interactive tasks
than a CPU-bound non-interactive process.
Decoding it into another file on the disk is in the latter category,
of course - but I don't think that's what he meant. :)

More on topic - while this was a tiny patch for Linux, it seems like
it would take more work for us, since I don't believe either of the
schedulers handles task groups in the required way. The Linux patch
was just "create task groups automatically", since they already had
some suitable logic for scheduling based on task groups in their CFS
scheduler. We would have to (re-)add that first, which is non-trivial.

personally i think freebsd would hugely benefit from a scheduler framework
such as geom/gsched, where it's easy to switch between various algorithms.

that way it would be much easier to try out new concepts without having to write a
completely new scheduler.


we are part of the way there..

at least we did abstract the scheduler to the point where
we have two completely different ones.
you are welcome to develop a 'framework' as you describe and plug it into
the abstraction we already have.


cheers.
alex



--
Daniel Nebdal



Re: PF + BRIDGE still causes system freezing

2010-05-28 Thread Julian Elischer


On 5/28/10 3:54 AM, Giulio Ferro wrote:

On 28.05.2010 07:46, Giulio Ferro wrote:

Would it be a good idea to try netgraph bridge?
Or the underlying implementation is the same as in if_bridge?


netgraph bridging (see /usr/share/examples/netgraph) is a completely
different implementation with different strengths and weaknesses.
you may find it works for you.






Months ago I reported a system freezing whenever bridge was used
with pf. This still happens now in 8.1 prerelease: after several
minutes to hours
that the bridge is active the system becomes unresponsive.

# uname -a
FreeBSD firewall1 8.1-PRERELEASE FreeBSD 8.1-PRERELEASE #0: Thu May 27
18:03:48 CEST 2010 r...@data1:/usr/obj/usr/src/sys/FIREWALL amd64


cat /etc/sysctl.conf

net.inet.ip.forwarding=1
net.inet.ip.fastforwarding=1
net.inet.carp.preempt=1

Services running : sshd, named, inetd, ntpd, openvpn (tap), racoon,
pptp, asterisk

2 physical interfaces : bce0, bce1
11 vlan interfaces : vlan1, ..., vlan11 (vlandev bce1)
11 carp interfaces ; carp1, ..., carp11 (carp1 has 23 alias addresses)
1 bridge interfaces : bridge0 addm vlan35 (used by openvpn)
2 gif interfaces : gif0, gif1 (racoon / IPSEC)

8 static routes

pf packet filter : 12 rdr rules, 3 nat rules, set skip{lo0, bridge0,
vlan35}, 4 pass quick, block log all, about 30 pass keep state



When the system freezes, I get this from the debugger
-
db> show allchains
db> show alllocks
Process 12 (intr) thread 0xff00024293e0 (100028)
exclusive sleep mutex if_bridge (if_bridge) r = 0 (0xff000270ea18)
locked @ /usr/src/sys/net/if_bridge.c:2184
Process 12 (intr) thread 0xff00022693e0 (100016)
exclusive sleep mutex Giant (Giant) r = 1 (0x80c93dc0) locked
@ /usr/src/sys/dev/usb/usb_transfer.c:3023
Process 12 (intr) thread 0xff00022607c0 (106)
exclusive sleep mutex carp_if (carp_if) r = 0 (0xff00027329e0)
locked @ /usr/src/sys/netinet/ip_carp.c:881
db>
-

Even if there is no solution yet, is there any quick and dirty
workaround I can try?
I need this rather badly...

Thanks.



___
freebsd-...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org



Re: net/mpd5, ppp, proxy-arp issues

2010-04-26 Thread Julian Elischer


On 4/26/10 1:11 AM, Stefan Esser wrote:

Am 22.04.2010 20:43, schrieb Marin Atanasov:

Hello,

Thanks a lot for the patch, Qing!

It works fine. However I've noticed one thing, after I start mpd5 and
connect to my home network:

kernel: WARNING: attempt to domain_add(netgraph) after domainfinalize()

Not very sure if this is something to worry about or not?


There was a problem with the initialization order of network domains,
which caused kernel crashes with ISDN+INET6 some two years ago. The
reason was, that there was an implicit assumption, that all domains
were initialized when the network interfaces are initialized, with
NULL dereferences if domains are added (and relevant to a device)
after the device has been initialized.

I debugged this problem and prepared a patch for discussion, which
later was committed by Max Laier (if memory serves me right). The
message was added in order to identify further situations, where
network domains are added after network interfaces have been
initialized. This message ought to be informational right now, since
the interface init is repeated whenever a network domain is added
as part of above mentioned patch. Init order should be fixed, if
this message is printed for compiled-in drivers, but in case of a
kernel module (like netgraph) that adds a domain, it is unavoidable
that the init order is reversed.

Perhaps the message should be made conditional on the start-up of
the kernel not having finished, or it should be completely removed,
since time has shown, that the init order is correct in general.

I'll remove that message (or make it conditional on bootverbose)
unless there is opposition to this change ...

please do..

it's an unavoidable thing that domains added after boot
are done after boot completes   :-)


Regards, STefan



Re: Routing question (GRE packet vs normal traceroute)?

2009-12-24 Thread Julian Elischer


Xin LI wrote:

Hi,

A friend of mine has encountered a problem in his setup, which
consists of a pair of GRE peers, one running on OpenBSD and another
running FreeBSD 7.2-RELEASE; with 7.2-STABLE, there is no improvement
over the situation.  The problem we have observed seems to be related
to GRE packets not being routed as expected; here are some details:

 - The FreeBSD box has one network interface connected to two (2)
upstream networks, with different IPs that do not belong to the same
subnet, say, one is 1.2.3.4/24 and another is 5.6.7.8/24
 - The default gateway can be reached through the first IP address
bound to the network interface;
 - An explicit route has been configured to the OpenBSD host, the
gateway being used can be reached directly via the secondary (aliased
5.6.7.8/24) IP.
 - Both the default gateway and the explicit host route can reach the
OpenBSD route.

The problem they had is that, while traceroute to the OpenBSD host gives
the desired result, packets that are supposed to be
transferred through the GRE tunnel are encapsulated
into a GRE packet, but the GRE packet itself won't go to the explicit host
route and ends up going to the default gateway.

The friend has configured his switch to bounce the packet back to
the server by configuring a host route on L3 switch, and it seems that
the FreeBSD box is able to route the GRE packet to its desired gateway
this time.

Any suggestions?


there is a hack in the GRE code that you can turn off where the GRE
envelope is looking up the address of the peer *WITH THE LAST BIT
SWITCHED*


try adding a route to the address of the OpenBSD host with /31 (not /32)

I forget how to turn it off but the man page says.

there IS a good reason for it if you want packets for the OpenBSD host
itself to go through the tunnel.. Then you need to not use that
address itself or you get a routing loop.
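
As a rough illustration of the lookup described above (a sketch only: the helper names and the sample address are made up for this example, and this is not the kernel code), the address with the last bit switched, and the /31 that covers both it and the real peer, can be worked out in sh like this:

```sh
#!/bin/sh
# Sketch only: helper names and sample addresses are hypothetical.

# Print dotted-quad address $1 with its lowest bit flipped.
flip_last_bit() {
    prefix=${1%.*}      # first three octets
    last=${1##*.}       # last octet
    echo "${prefix}.$(( last ^ 1 ))"
}

# Print the /31 network that contains both $1 and its flipped twin.
covering_31() {
    prefix=${1%.*}
    last=${1##*.}
    echo "${prefix}.$(( last & ~1 ))/31"
}

# Example (this only prints, nothing on the system is changed):
#   flip_last_bit 192.0.2.10
#   covering_31 192.0.2.10
```

A route for that /31 pointing at the desired gateway matches either address, which is what the /31 suggestion relies on.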







Cheers,



Re: recently happend kernel panics regarding usb

2009-01-27 Thread Julian Elischer


Oliver Lehmann wrote:

Markus Hitter wrote:

If you throw the EHCI driver out of the kernel your drive will use  
either OHCI or UHCI (both are slow). This seems to help, at least for  
the limited things I use this pen drive now.


I'm not sure that this g_vfs_done is related to the panic. I've attached
the drive to a uhci-driven port on the same machine, started an fsck and
I've got an immediate panic:

trying to sleep while sleeping is prohibited


when you hold a mutex in the kernel you are not allowed to go to sleep
as other kernel actors may need that mutex..
OR an interrupt thread is trying to sleep.
I doubt it has anything to do with a usb device hibernating.






If I remember it correctly, the driver has some power saving feature
which shuts the drive down if it is not used for some time and spins it
up when a request arrives. But yesterday I powered the drive up... waited
some seconds and then started a fsck. So I guess it was not in a
shutdown state - So I wonder who requested a sleep ;)






Re: FreeBSD 6.3 gre and traceroute

2008-11-14 Thread Julian Elischer


Stephen Clark wrote:

Stephen Clark wrote:




10.0.129.1 FreeBSD workstation
 ^
 |
 | ethernet
 |
 v
10.0.128.1 Freebsd FW A
 ^
 |
 | gre / ipsec
 |
 v
192.168.3.1 FreeBSD FW B
 ^
 |
 | ethernet
 |
 v
192.168.3.86 linux workstation



Also just using gre's without the underlying ipsec tunnels seems to
work properly.



This is the crux of the matter.
IPSEC happens INSIDE the IP stack. The IP stack is responsible for
the ICMP generation so it is much more likely that there is an 
interaction there.


Now is there an IPSEC rule to make sure that the ICMP packet can get
back?  It could be that in the IP stack there is some confusion as to
whether the return packet should be encrypted or not and it might get
dropped.


the code involved is in /sys/netinet and /sys/netipsec but you'll
probably regret looking in there ;-)







Another data point: I had been using option FILTER_GIF. I tried a kernel
without that option and it behaved the same.

Steve




Re: machine hangs on occasion - correlated with ssh break-in attempts

2008-08-21 Thread Julian Elischer

Kevin Oberman wrote:

Date: Thu, 21 Aug 2008 13:38:38 -0400
From: Mikhail Teterin [EMAIL PROTECTED]
Sender: [EMAIL PROTECTED]

Hello!

A machine I manage remotely for a friend comes under a distributed ssh 
break-in attack every once in a while. Annoyed (and alarmed) by the 
messages like:

Aug 12 10:21:17 symbion sshd[4333]: Invalid user mythtv from 85.234.158.180
Aug 12 10:21:18 symbion sshd[4335]: Invalid user mythtv from 85.234.158.180
Aug 12 10:21:20 symbion sshd[4337]: Invalid user mythtv from 85.234.158.180
Aug 12 10:21:21 symbion sshd[4339]: Invalid user mythtv from 85.234.158.180

I wrote an awk-script, which adds a block of the attacking IP-address to 
the ipfw-rules after three such invalid user attempts with:

ipfw add 550 deny ip from ip

The script is fed by syslogd directly -- through a syslog.conf rule 
(|/opt/sbin/auth-log-watch).

Once in a while I manually flush these rules... Is this a good (safe)
reaction?
I'm asking, because the machine (currently running 7.0 as of July 7) 
hangs solid once every few weeks... My only guess is that a spike in 
attacks causes too many ipfw-entries created, which paralyzes the 
kernel due to some bug -- the machine is running natd and is the gateway 
for the rest of the network...
The hangs could, of course, be caused by something else entirely, but my 
self-defense mechanism is my first suspect...

Any comments? Thanks!

also, if you do this, have a single rule that uses a table
and add the addresses to the table.

Looks remarkably like sshguard (ports/security/sshguard-*). It does almost
exactly what you are doing but is written in C and has command-line
switches to set how long a system is blocked, how many attempts
constitute an attack and how long it should remember failed attempts. It
also allows the use of back-end scripts if you want it to do something
else such as generate reports (beyond an entry in /var/log/messages).

As far as the hangs, I don't believe it is from the large number of
brute force attempts as they will stop for a given host as soon as the
firewall is updated. I seldom see more than a handful of attack sources
over any short period.

Should you want to continue with your own tool, at least for IPv4,
consider using tables rather than a raft of rules. With tables, you need
only a single rule and it is there at boot time.
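
A minimal sketch of the table approach, assuming the log line format quoted above. The table number (1), the threshold (3) and the function names are picked for illustration only; the commands are printed rather than run so they can be inspected first:

```sh
#!/bin/sh
# Sketch only: table number, threshold and function names are assumptions.
TABLE=1
THRESHOLD=3

# Read sshd log lines on stdin and print each source address once,
# at the moment it reaches THRESHOLD "Invalid user" lines.
offenders() {
    awk -v n="$THRESHOLD" '
        /sshd\[[0-9]+\]: Invalid user/ {
            ip = $NF                       # address is the last field
            if (++seen[ip] == n) print ip
        }'
}

# Turn addresses on stdin into ipfw table commands (printed, not executed).
table_cmds() {
    while read -r ip; do
        echo "ipfw table ${TABLE} add ${ip}"
    done
}

# The single rule that consults the table would be installed once, e.g.:
#   ipfw add 550 deny ip from 'table(1)' to any
# and the pipeline would be used as:
#   offenders < /var/log/auth.log | table_cmds
```

With this layout the rule set stays the same size no matter how many addresses are blocked; only the table grows, and flushing it is a single `ipfw table 1 flush`.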


PCBSD X11 ugh

2008-03-11 Thread Julian Elischer


So, I tried out PCBSD on a Dell DHP (whatever that is) (made Feb '05).

it installs great
but when I boot it (FreeBSD comes up fine), the X server goes into
an infinite loop somewhere.


  168 root  1   00   148M  6976K rdnrel   0:57 93.85% Xorg

and the screen stays black.
the fan goes onto tornado mode and it just sits there.


attempting to send a signal -9 to the X server has no effect so it's
stuck in the kernel somewhere.


does anyone have any X11 foo (or PCBSD foo) to let me know how to get
the damned server to do what it did in install, when it was just fine.
possibly I need to disable some kernel extension feature..

I'm not really sure what info I need to send.
every 4 years I get hit by some X11 thing like this.. just far
enough apart that I've forgotten all I knew, and the whole
X11 landscape has changed since I last looked at it..

maybe I'll try desktopBSD.. hmm another download..



Re: Analysis of disk file block with ZFS checksum error

2008-02-08 Thread Julian Elischer


Joe Peterson wrote:

Chris Dillon wrote:
That is a chunk of a Mozilla Mork-format database.  Perhaps the  
Firefox URL history or address book from Thunderbird.


Interesting (thanks to all who recognized Mork).  I do use Firefox and
Thunderbird, so it's feasible, but how the heck would a piece of one of
those files find its way into 1/2 of a ZFS block in one of my mp3 files?
   I wonder if it could have been done on write when the file was copied
to the ZFS pool (maybe some write-caching issue?), but I thought ZFS
would have verified the block after write.  It seems unlikely that it
would g


it could be an old file..
what kind of disks?
I had a scenario where 3ware controllers were just failing to write to
a drive in the array, so old data showed through.

it was possible, by looking to see where the boundary between good and
bad was, to identify the culprit..


the filesystem and the partitions and the raids all were on different
alignments so the only part of the system that had a boundary that
aligned with the bad data was the physical stripes laid down by the
controller.  It was 64k stripes and 64k data missing, exactly on
stripe boundaries. Due to the fact that FreeBSD had partitioned the
drive starting at 63 blocks in, nothing else aligned with the problem.
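
The alignment argument can be checked with a little arithmetic. This is only a sketch: the 512-byte sector size is an assumption, and the function names are made up; the 64k stripe and the 63-block partition start come from the description above.

```sh
#!/bin/sh
# Sketch only: sector size is assumed, function names are hypothetical.
SECTOR=512                # bytes per block (assumption)
STRIPE=$((64 * 1024))     # 64k controller stripe
PART_START=63             # partition begins 63 blocks into the drive

# Convert a byte offset inside the partition to a byte offset on the drive.
disk_offset() {
    echo $(( PART_START * SECTOR + $1 ))
}

# Print "yes" if a drive byte offset sits exactly on a stripe boundary.
on_stripe_boundary() {
    if [ $(( $1 % STRIPE )) -eq 0 ]; then echo yes; else echo no; fi
}

# Example:
#   on_stripe_boundary "$(disk_offset 0)"
#   on_stripe_boundary "$(disk_offset 33280)"
```

Because the partition starts 63 * 512 = 32256 bytes in, offsets that look 64k-aligned from inside the partition are not stripe-aligned on the drive, so damage that begins exactly on a stripe boundary points at the layer below the partition.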




-Joe
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to [EMAIL PROTECTED]



Re: Analysis of disk file block with ZFS checksum error

2008-02-08 Thread Julian Elischer


Joe Peterson wrote:

Julian Elischer wrote:

it could be an old file..
what kind of disks?


It's a Seagate ST3500630A parallel ATA drive.


I had a scenario where 3ware controllers were just failing to write to
a drive in the array, so old data showed through.


I have an Intel ICH4 controller - nothing unusual.


the filesystem and the partitions and the raids all were on different
alignments so the only part of the system that had a boundary that
aligned with the bad data was the physical stripes laid down by the
controller.  It was 64k stripes and 64k data missing, exactly on
stripe boundaries. Due to the fact that FreeBSD had partitioned the
drive starting at 63 blocks in, nothing else aligned with the problem.


Hmm, well this is a straight-forward disk situation - never used RAID on
this drive.  Given what is happening, I wonder about the chances of it being
a HW, OS, or filesystem issue.

-Joe


still, see whether the 64k lines up with the drive or with
the filesystem (if the filesystem is not on an exact 64k boundary
of the drive).

Re: Packet loss every 30.999 seconds

2007-12-19 Thread Julian Elischer


David G Lawrence wrote:

 In any case, it appears that my patch is a no-op, at least for the
problem I was trying to solve. This has me confused, however, because at
one point the problem was mitigated with it. The patch has gone through
several iterations, however, and it could be that it was made to the top
of the loop, before any of the checks, in a previous version. Hmmm.

The patch should work fine.  IIRC, it yields voluntarily so that other
things can run.  I committed a similar hack for uiomove().  It was


   It patches the bottom of the loop, which is only reached if the vnode
is dirty. So it will only help if there are thousands of dirty vnodes.
While that condition can certainly happen, it isn't the case that I'm
particularly interested in.


CPUs, everything except interrupts has to wait for these syscalls.  Now
the main problem is to figure out why PREEMPTION doesn't work.  I'm
not working on this directly since I'm running ~5.2 where nearly-full
kernel preemption doesn't work due to Giant locking.


   I don't understand how PREEMPTION is supposed to work (I mean
to any significant detail), so I can't really comment on that.


It's really very simple.

When you do a wakeup
(or anything else that puts a thread on a run queue,
i.e. use setrunqueue()),
then if that thread has more priority than you do (and in the general case
is an interrupt thread), you immediately call mi_switch so that it runs
immediately.
You are guaranteed to run again when it finishes
(you are not just put back on the run queue at the end).


the critical_enter()/critical_exit() calls disable this from happening to
you if you really must not be interrupted by another thread.


there is an option where it is not just interrupt threads that can jump in,
but I think it's usually disabled.
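
The decision described above can be written down as a toy model. This is not kernel code, just a sketch: the function name and the three return words are invented for illustration, and it ignores the interrupt-thread-only restriction mentioned above. It does follow the FreeBSD convention that a numerically lower priority is more important.

```sh
#!/bin/sh
# Toy model only, not kernel code. Lower number = higher priority.

# on_wakeup CUR_PRI NEW_PRI CRIT_NEST
#   CUR_PRI   priority of the thread doing the wakeup
#   NEW_PRI   priority of the thread being put on the run queue
#   CRIT_NEST > 0 models being between critical_enter() and critical_exit()
on_wakeup() {
    cur=$1 new=$2 crit=$3
    if [ "$new" -ge "$cur" ]; then
        echo "queued"      # not more important than us: it waits its turn
    elif [ "$crit" -gt 0 ]; then
        echo "deferred"    # would preempt, but we are in a critical section
    else
        echo "switch"      # we switch away now and resume when it finishes
    fi
}

# Example:
#   on_wakeup 120 20 0
#   on_wakeup 120 20 1
```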




-DG

David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
The FreeBSD Project - http://www.freebsd.org
Pave the road of life with opportunities.



Re: no matching session in ng_pppoe.c 1.74.2.4? (RELENG_6)

2007-12-17 Thread Julian Elischer


cpghost wrote:

On Sun, 09 Dec 2007 14:01:27 -0800
Julian Elischer [EMAIL PROTECTED] wrote:


cpghost wrote:

On Sun, 09 Dec 2007 11:13:13 -0800
Julian Elischer [EMAIL PROTECTED] wrote:


--- manually restarting ppp(1), then:


17:10:47.306928 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype
PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0x40C663C1]
  [Service-Name]

17:10:47.306939 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype
PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0xC06220C1]
  [Service-Name]

we still have 2 sessions instead of 1, but there is less confusion 
so things sort themselves out.

Just one more thing:

If I remember correctly, sending two PADIs in quick succession
was ppp's normal behaviour for *years* now (is it expected or
required by the protocol? I don't know). I've always wondered
why it was so. But that didn't cause any harm as it seemed one
of the two PADO was picked up and eventually turned into a session.

-cpghost.


btw try mpd as well.


So... I'm running net/mpd5 on that router for a few days now, and
it managed 3 forced disconnects in a row and no session chaos at
all, while ppp(8) would probably have initiated a lot of parallel
sessions again but no connection.

So up until now (but perhaps it's too early to be sure?),
net/mpd5 is fine, while ppp(8) is not.

Btw, I've compared the sources of ppp(8) from 2007-09-24/25
when it was still working, and 2007-11-30 when I've updated
the router, and there's NO difference there at all. Whatever
broke ppp(8), it was not ppp(8) but something else
(I suspect ng_pppoe.c): maybe the code clean up exposed
some hidden bug in ppp(8)?

I hope ppp(8) will be fixed before 6.3-RELEASE; even though
net/mpd5 is excellent and very snappy as well. ;-)


mpd is also using ng_pppoe of course.



Regards,
-cpghost.




Re: no matching session in ng_pppoe.c 1.74.2.4? (RELENG_6)

2007-12-09 Thread Julian Elischer


cpghost wrote:

On Thu, 6 Dec 2007 16:11:07 +0100
cpghost [EMAIL PROTECTED] wrote:


On Thu, 06 Dec 2007 13:57:16 +0200
Alexander Motin [EMAIL PROTECTED] wrote:


cpghost wrote:

The problem is that the last mile carrier of the PPP provider
that this router is attached to disconnects the ppp session
forcibly once every 24h. Before the update, ppp would detect
this and reconnect immediately. After the update, ppp doesn't
recover gracefully from this anymore, but spits out on the
console:

ng_pppoe[5]: no matching session

for hours, and tries to connect again every two minutes without 
success, until I manually stop and restart the userland ppp daemon

(and then the connection is immediately restored with a new
session). I've tried this for a few days now, and it is always the
same: it's definitely not a problem on the provider's side: As
soon as ppp restarts, it gets a new session without any problems
and connects again.

Since the last working sources were from 2007/09/25, and
ng_pppoe.c was at rev. 1.74.2.3; and the new revision of
ng_pppoe.c is now at 1.74.2.4; I'm suspecting that whatever
was changed there could be the cause (because this "no matching
session" is being logged from there).

I have tested and am unable to reproduce that myself with ppp <-> mpd or
mpd <-> mpd PPPoE connections. Actually I am not sure about any
difference between reconnect and ppp restart. From the ng_pppoe node
point of view it should be the same.

Could you provide tcpdump output for connection tries from your
Ethernet interface? Use -pes 0 options please.

Will do; but I'll first have to wait 24h from now to get a
forcibly disconnected session (I've just had to restart ppp
again).


All right, I've got a good tcpdump now:

# tcpdump -i sis0 -n -pes 0
tcpdump: verbose output suppressed, use -v or -vv for full protocol
decode listening on sis0, link-type EN10MB (Ethernet), capture size
65535 bytes

17:06:08.400881 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff,
  ethertype PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq
  0xC0C263C1] [Service-Name]

17:06:08.400891 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype PPPoE
  D (0x8863), length 32: PPPoE PADI [Host-Uniq 0xC0ED45C1]
  [Service-Name]

17:06:08.400898 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype PPPoE
  D (0x8863), length 32: PPPoE  PADI [Host-Uniq 0x40C263C1]
  [Service-Name]

17:06:08.400904 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff,
  ethertype PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq
  0x80CA63C1] [Service-Name]

17:06:08.400910 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype PPPoE
  D (0x8863), length 32: PPPoE PADI [Host-Uniq 0x80C963C1]
  [Service-Name]



why are we sending PADIs for 4 different sessions?
what does "ngctl list" show?




17:06:08.528227 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype PPPoE
  D (0x8863), length 66: PPPoE PADO [AC-Name DSSX43-erx] [Host-Uniq
  0xC0C263C1] [Service-Name] [AC-Cookie ..7\t.K.,.!y.y.E]




they respond to the first one we sent.
this info should now be sent to the ppp daemon.



17:08:08.488679 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype PPPoE
  D (0x8863), length 32: PPPoE PADI [Host-Uniq 0x806320C1]
  [Service-Name]

17:08:08.488690 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype PPPoE
  D (0x8863), length 32: PPPoE  PADI [Host-Uniq 0x40D063C1]
  [Service-Name]

17:08:08.488696 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff,
  ethertype PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq
  0x00C063C1] [Service-Name]

17:08:08.488702 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype PPPoE
  D (0x8863), length 32: PPPoE PADI [Host-Uniq 0x40CE63C1]
  [Service-Name]

17:08:08.488708 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype PPPoE
  D (0x8863), length 32: PPPoE PADI [Host-Uniq 0x80EC45C1]
  [Service-Name]



hey these are 4 more new pppoe sessions making 8
Are they just accumulating from each try?  maybe we are not cleaning up?



17:08:08.552036 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype PPPoE
  D (0x8863), length 66: PPPoE  PADO [AC-Name DSSX43-erx]
  [Host-Uniq 0x806320C1] [Service-Name] [AC-Cookie ..7\t.K.,.!y.y.E]

17:08:08.557191 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype PPPoE
  D (0x8863), length 66: PPPoE PADO [AC-Name DSSX43-erx] [Host-Uniq
  0x40D063C1] [Service-Name] [AC-Cookie ..7\t.K.,.!y.y.E]

17:08:08.572971 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype PPPoE
  D (0x8863), length 66: PPPoE PADO [AC-Name DSSX43-erx] [Host-Uniq
  0x00C063C1] [Service-Name] [AC-Cookie ..7\t.K.,.!y.y.E]

17:08:08.577148 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype PPPoE
  D (0x8863), length 66: PPPoE PADO [AC-Name DSSX43-erx] [Host-Uniq
  0x40CE63C1] [Service-Name] [AC-Cookie ..7\t.K.,.!y.y.E]

17:08:08.581343 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype PPPoE
  D (0x8863), length 66: PPPoE PADO [AC-Name DSSX43-erx] [Host-Uniq
  0x80EC45C1] [Service-Name] [AC-Cookie ..7\t.K.,.!y.y.E]



router has responded to 4 more..  they should all be reporting back to
whatever started them.


some retries..
I haven't looked at

Re: no matching session in ng_pppoe.c 1.74.2.4? (RELENG_6)

2007-12-09 Thread Julian Elischer


cpghost wrote:

On Sun, 09 Dec 2007 11:13:13 -0800
Julian Elischer [EMAIL PROTECTED] wrote:


cpghost wrote:

On Thu, 6 Dec 2007 16:11:07 +0100
cpghost [EMAIL PROTECTED] wrote:


On Thu, 06 Dec 2007 13:57:16 +0200
Alexander Motin [EMAIL PROTECTED] wrote:


cpghost wrote:

The problem is that the last mile carrier of the PPP provider
that this router is attached to disconnects the ppp session
forcibly once every 24h. Before the update, ppp would detect
this and reconnect immediately. After the update, ppp doesn't
recover gracefully from this anymore, but spits out on the
console:

ng_pppoe[5]: no matching session

for hours, and tries to connect again every two minutes without 
success, until I manually stop and restart the userland ppp

daemon (and then the connection is immediately restored with a
new session). I've tried this for a few days now, and it is
always the same: it's definitely not a problem on the provider's
side: As soon as ppp restarts, it gets a new session without any
problems and connects again.

Since the last working sources were from 2007/09/25, and
ng_pppoe.c was at rev. 1.74.2.3; and the new revision of
ng_pppoe.c is now at 1.74.2.4; I'm suspecting that whatever
was changed there could be the cause (because this "no matching
session" message is being logged from there).

I have tested and am unable to reproduce that myself with ppp <-> mpd
or mpd <-> mpd PPPoE connections. Actually, I am not sure there is any
difference between a reconnect and a ppp restart. From the ng_pppoe
node's point of view it should be the same.

Could you provide tcpdump output for connection tries from your
Ethernet interface? Use -pes 0 options please.

Will do; but I'll first have to wait 24h from now to get a
forcibly disconnected session (I've just had to restart ppp
again).

All right, I've got a good tcpdump now:

# tcpdump -i sis0 -n -pes 0
tcpdump: verbose output suppressed, use -v or -vv for full protocol
decode listening on sis0, link-type EN10MB (Ethernet), capture size
65535 bytes

17:06:08.400881 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff,
  ethertype PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq
  0xC0C263C1] [Service-Name]

17:06:08.400891 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype
PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0xC0ED45C1]
  [Service-Name]

17:06:08.400898 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype
PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0x40C263C1]
  [Service-Name]

17:06:08.400904 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff,
  ethertype PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq
  0x80CA63C1] [Service-Name]

17:06:08.400910 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype
PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0x80C963C1]
  [Service-Name]


Why are we sending PADIs for 4 different sessions?
What does "ngctl list" show?


Right now, with the working connection:

# ngctl list
There are 6 total nodes:
  Name: ngctl53522      Type: socket          ID: 02ec   Num hooks: 0
  Name: <unnamed>       Type: socket          ID: 02eb   Num hooks: 1
  Name: <unnamed>       Type: pppoe           ID: 0005   Num hooks: 2
  Name: sis2            Type: ether           ID: 0003   Num hooks: 0
  Name: sis1            Type: ether           ID: 0002   Num hooks: 0
  Name: sis0            Type: ether           ID: 0001   Num hooks: 1


17:06:08.528227 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype
PPPoE D (0x8863), length 66: PPPoE PADO [AC-Name DSSX43-erx]
[Host-Uniq 0xC0C263C1] [Service-Name] [AC-Cookie ..7\t.K.,.!y.y.E]



They respond to the first one we sent.
This info should now be sent to the ppp daemon.
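
The matching step discussed here can be pictured with a small sketch. It assumes a simple table of outstanding PADIs keyed by their Host-Uniq value; the struct, the function names and the table size are hypothetical and are not taken from ng_pppoe.c. A PADO whose Host-Uniq is not in the table is the "no matching session" case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one outstanding PADI (not ng_pppoe's struct). */
struct pending_padi {
	uint32_t host_uniq;	/* Host-Uniq tag value sent in the PADI */
	int	 in_use;	/* nonzero while a PADO is still awaited */
};

#define MAX_PENDING 8		/* illustrative table size */

/* Record a new outstanding PADI; returns its slot, or -1 if the table is full. */
int
pending_add(struct pending_padi *tab, uint32_t host_uniq)
{
	assert(tab != NULL);
	for (int i = 0; i < MAX_PENDING; i++) {
		if (!tab[i].in_use) {
			tab[i].host_uniq = host_uniq;
			tab[i].in_use = 1;
			return i;
		}
	}
	return -1;
}

/* Find the slot an incoming PADO belongs to; -1 means "no matching session". */
int
pending_match(const struct pending_padi *tab, uint32_t pado_host_uniq)
{
	assert(tab != NULL);
	for (int i = 0; i < MAX_PENDING; i++) {
		if (tab[i].in_use && tab[i].host_uniq == pado_host_uniq)
			return i;
	}
	return -1;
}

/* Forget a request, e.g. when the daemon gives up on it or restarts. */
void
pending_remove(struct pending_padi *tab, int slot)
{
	assert(tab != NULL && slot >= 0 && slot < MAX_PENDING);
	tab[slot].in_use = 0;
}
```

If entries are never removed on a failed attempt, the table keeps growing with each retry, which is one way the accumulation suspected in this thread could look.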



17:08:08.488679 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype
PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0x806320C1]
  [Service-Name]

17:08:08.488690 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype
PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0x40D063C1]
  [Service-Name]

17:08:08.488696 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff,
  ethertype PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq
  0x00C063C1] [Service-Name]

17:08:08.488702 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype
PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0x40CE63C1]
  [Service-Name]

17:08:08.488708 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype
PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0x80EC45C1]
  [Service-Name]


Hey, these are 4 more new pppoe sessions, making 8.
Are they just accumulating from each try?  Maybe we are not cleaning
up?


17:08:08.552036 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype
PPPoE D (0x8863), length 66: PPPoE PADO [AC-Name DSSX43-erx]
  [Host-Uniq 0x806320C1] [Service-Name] [AC-Cookie
..7\t.K.,.!y.y.E]

17:08:08.557191 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype
PPPoE D (0x8863), length 66: PPPoE PADO [AC-Name DSSX43-erx]
[Host-Uniq 0x40D063C1] [Service-Name] [AC-Cookie ..7\t.K.,.!y.y.E]

17:08:08.572971 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype
PPPoE D (0x8863), length 66: PPPoE PADO [AC-Name DSSX43-erx]
[Host-Uniq 0x00C063C1] [Service-Name

Re: no matching session in ng_pppoe.c 1.74.2.4? (RELENG_6)

2007-12-09 Thread Julian Elischer


cpghost wrote:

On Sun, 09 Dec 2007 11:13:13 -0800
Julian Elischer [EMAIL PROTECTED] wrote:


--- manually restarting ppp(1), then:


17:10:47.306928 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype
PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0x40C663C1]
  [Service-Name]

17:10:47.306939 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype
PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0xC06220C1]
  [Service-Name]

we still have 2 sessions instead of 1, but there is less confusion 
so things sort themselves out.


Just one more thing:

If I remember correctly, sending two PADIs in quick succession
was ppp's normal behaviour for *years* now (is it expected or
required by the protocol? I don't know). I've always wondered
why it was so. But that didn't cause any harm as it seemed one
of the two PADO was picked up and eventually turned into a session.

-cpghost.



btw try mpd as well.


MFC TO 6.X (6.3?) to fix aio_return() ?

2007-11-29 Thread Julian Elischer



This diff is a partial MFC (picking parts out of -current)
that makes aio_return() return the error status of a completed AIO
request (as it does on other OSes and in 7.x).


The man pages for 6.x and other OSes indicate that aio_return()
should return all the same results as a returning read() or write(),
including setting errno on error.

In 6.x this does not happen. In 7.0 it does.
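
As a rough userland model of the contract described above (not the kernel code in the diff below), the behaviour can be sketched like this. The function name model_aio_return and its two parameters are made up for illustration; they stand for the status and error values stored for the completed request.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

/*
 * Model of the intended aio_return() contract: on error, return -1 and
 * set errno, just as a failing read() or write() would; otherwise return
 * the transfer count. 'status' and 'error' are hypothetical inputs.
 */
ssize_t
model_aio_return(ssize_t status, int error)
{
	assert(error >= 0);	/* errno values are non-negative */
	if (error != 0) {
		errno = error;	/* report the failure like read()/write() */
		return -1;
	}
	return status;		/* byte count of the completed transfer */
}
```

A caller written against this contract can then treat a completed AIO request the same way it treats a synchronous read() result.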

The included test program can show the result when using gnop(8)
to simulate I/O errors.

BTW, the test program could serve as a starting point for sample code
showing how to use kqueue and aio together.



If people agree this is worth fixing,  it would be nice to get it in 6.3




Index: vfs_aio.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/vfs_aio.c,v
retrieving revision 1.195.2.4
diff -d -u -r1.195.2.4 vfs_aio.c
--- vfs_aio.c	9 Sep 2006 01:30:11 -0000	1.195.2.4
+++ vfs_aio.c	29 Nov 2007 19:26:12 -0000
@@ -1529,6 +1529,7 @@
 	struct aiocblist *cb, *ncb;
 	struct aiocb *ujob;
 	struct kaioinfo *ki;
+	int status, error;
 
 	ujob = uap->aiocbp;
 	jobref = fuword(&ujob->_aiocb_private.kernelinfo);
@@ -1542,14 +1543,6 @@
 	TAILQ_FOREACH(cb, &ki->kaio_jobdone, plist) {
 		if (((intptr_t) cb->uaiocb._aiocb_private.kernelinfo) ==
 		    jobref) {
-			if (cb->uaiocb.aio_lio_opcode == LIO_WRITE) {
-				p->p_stats->p_ru.ru_oublock +=
-				    cb->outputcharge;
-				cb->outputcharge = 0;
-			} else if (cb->uaiocb.aio_lio_opcode == LIO_READ) {
-				p->p_stats->p_ru.ru_inblock += cb->inputcharge;
-				cb->inputcharge = 0;
-			}
 			goto done;
 		}
 	}
@@ -1565,15 +1558,33 @@
  done:
 	PROC_UNLOCK(p);
 	if (cb != NULL) {
-		if (ujob == cb->uuaiocb) {
-			td->td_retval[0] =
-			    cb->uaiocb._aiocb_private.status;
-		} else
-			td->td_retval[0] = EFAULT;
-		aio_free_entry(cb);
-		return (0);
+		status = cb->uaiocb._aiocb_private.status;
+		error = cb->uaiocb._aiocb_private.error;
+		if (ujob != cb->uuaiocb) {
+			/* check for a mismatch. is it possible? */
+			/* (It's not in 7.x) */
+			error = EFAULT;
+		} else {
+			if (error == 0) {
+				td->td_retval[0] = status;
+			}
+			if (cb->uaiocb.aio_lio_opcode == LIO_WRITE) {
+				p->p_stats->p_ru.ru_oublock +=
+				    cb->outputcharge;
+				cb->outputcharge = 0;
+			} else if (cb->uaiocb.aio_lio_opcode == LIO_READ) {
+				p->p_stats->p_ru.ru_inblock += cb->inputcharge;
+				cb->inputcharge = 0;
+			}
+			suword(&ujob->_aiocb_private.error, error);
+			suword(&ujob->_aiocb_private.status, status);
+			aio_free_entry(cb);
+		}
+	} else {
+		/* no such aiocb known */
+		error = EINVAL;
 	}
-	return (EINVAL);
+	return (error);
 }
 
 /*
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <strings.h>
#include <signal.h>
#include <fcntl.h>
#include <sys/param.h>
#include <stddef.h>
#include <sys/aio.h>
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>



#define BUFSIZE 512
#define TMOUT_SEC 5
#define TMOUT_NSEC 0

main()
{

	int fd;
	int ret;
	struct aiocb my_aiocb;
	int kq;

	if ((kq = kqueue()) == -1)
		err(1, "kqueue");
	fd = open("/dev/mfid0s1d.nop", O_RDONLY);
	if (fd < 0)
		perror("open");

	/* Zero out the aiocb structure (recommended) */
	bzero((char *)&my_aiocb, sizeof(struct aiocb));

	/* Allocate a data buffer for the aiocb request */
	my_aiocb.aio_buf = malloc(BUFSIZE + 1);
	if (!my_aiocb.aio_buf)
		perror("malloc");

	/* Initialize the necessary fields in the aiocb */
	my_aiocb.aio_fildes = fd;
	my_aiocb.aio_nbytes = BUFSIZE;
	my_aiocb.aio_offset = (512 * (100LL + 10));
	my_aiocb.aio_sigevent.sigev_notify = SIGEV_KEVENT;
	my_aiocb.aio_sigevent.sigev_notify_kqueue = kq;
	/* udata for the created kqueue */
#if __FreeBSD_version > 700000
	my_aiocb.aio_sigevent.sigev_value.sival_ptr = NULL;
#else
	my_aiocb.aio_sigevent.sigev_value.sigval_ptr = NULL;
#endif

	ret = aio_read(&my_aiocb);
	if (ret < 0)

Re: connect() returns EADDRINUSE during massive host-host conn rate

2007-11-28 Thread Julian Elischer


Jan Srzednicki wrote:

Hello,

I have a pair of hosts. One of them performs a massive amount of
TCP connections to the other one, all to the same port. This setup
mostly works fine, but from time to time (that varies, from once a
minute to once every half hour), the connect(2) syscall fails with
EADDRINUSE. The connection rate tops to 50 connection


so, what does netstat -aAn show?


initiations/second.

The socket is non-blocking. It does the standard job of creating the socket,
setting up the relevant fields, setting SO_REUSEADDR and SO_KEEPALIVE,
setting O_NONBLOCK on the descriptor. No bind(2) is performed. The
connection is initiated from inside a jail (not sure if that implies an
internal bind(2) to the jail's address). There are no connections from
the other host to the first one.

I've tried tuning the net.inet.ip.portrange variables: I've increased
the available portrange to over 45000 ports (quite a lot, should be more
than enough for just anything) and I've toggled
net.inet.ip.portrange.randomized off, but that didn't change anything.

The workaround on the application side - retrying on EADDRINUSE - works
pretty well, but hey, from what I know from the Stevens book, that
shouldn't be happening, though Google said all BSD had a bad habit of
throwing out EADDRINUSE from time to time.
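
The application-side workaround mentioned above might look roughly like the following sketch. It is not the poster's code: the function names and the retry limit parameter are hypothetical, and it uses a blocking socket for brevity (a non-blocking caller would also have to handle EINPROGRESS).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Retry only for EADDRINUSE, and only while attempts remain. */
int
should_retry_connect(int err, int attempt, int max_attempts)
{
	assert(attempt >= 0);
	return err == EADDRINUSE && attempt + 1 < max_attempts;
}

/*
 * Connect a fresh TCP socket to 'sa', retrying with a new socket when
 * connect(2) fails with EADDRINUSE. Returns the connected descriptor,
 * or -1 with errno set.
 */
int
connect_retry(const struct sockaddr *sa, socklen_t salen, int max_attempts)
{
	for (int attempt = 0; attempt < max_attempts; attempt++) {
		int fd = socket(sa->sa_family, SOCK_STREAM, 0);
		if (fd < 0)
			return -1;
		if (connect(fd, sa, salen) == 0)
			return fd;
		int err = errno;	/* save before close() can clobber it */
		close(fd);
		if (!should_retry_connect(err, attempt, max_attempts)) {
			errno = err;
			return -1;
		}
	}
	errno = EINVAL;			/* only reached if max_attempts <= 0 */
	return -1;
}
```

Each retry uses a new socket so the kernel picks a fresh ephemeral port; this hides the symptom but does not explain why the port selection collided in the first place.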

This all happens on a 6.2-RELEASE system. The symptoms are easily
reproducible in my environment.

Is there any known fix for that? If there ain't, can it be fixed? :)




Re: connect() returns EADDRINUSE during massive host-host conn rate

2007-11-28 Thread Julian Elischer


Jan Srzednicki wrote:

On Wed, Nov 28, 2007 at 10:22:08AM -0800, Julian Elischer wrote:

Jan Srzednicki wrote:

Hello,
I have a pair of hosts. One of them performs a massive amount of
TCP connections to the other one, all to the same port. This setup
mostly works fine, but from time to time (that varies, from once a
minute to once every half hour), the connect(2) syscall fails with
EADDRINUSE. The connection rate tops to 50 connection

so, what does netstat -aAn show?


How can I get any usable information from netstat? It shows a bunch of
connections, of course, but since connect(2) failed, I have no idea what
local port I was trying to use.

but you can get an idea of the local socket distribution, and what state all
the sockets are in  (TIME_WAIT etc).



And, what I forgot to mention, it's a SMP box, which could matter in
case of some race condition.


hopefully not.








Re: Test changes to em

2007-11-02 Thread Julian Elischer


Mike Tancsa wrote:

At 12:33 PM 11/2/2007, Jack Vogel wrote:

So at this point I'm unclear, with my reposting of if_em.c last
night has everyone seen both parts or do I have to try something
else?



I never saw a .c file

so put them in ~/public_html on freefall
and they can be accessed as:

http://people.freebsd.org/~(yourlogin)/(filename)


Seems to work. I grabbed it from the mailing list archive off 
www.freebsd.org


possibly the archive gets it before stripping?



http://lists.freebsd.org/pipermail/freebsd-stable/2007-November/037936.html



Re: ipfw add pipe broken?

2007-04-03 Thread Julian Elischer


JoaoBR wrote:

On Sunday 01 April 2007 15:22, Mike Tancsa wrote:

At 11:55 AM 4/1/2007, JoaoBR wrote:

With all respect to Julian's work, but with ipfw broken and Sunday fucked up
...

It is kind of scary to see "I have no time to check, I'll do it on Tuesday"
or "I need to do the userland ipfw too to add some new features, but not
today".
Please do it all or don't do it; ipfw is a mature and essential
part where we
do not expect such sudden surprises in RELENG_6 to happen.

I seriously doubt he intentionally meant to break it Accidents


man sure not, no one said that



I thought I had these MFCs all worked out, and one seems to have somehow been
mixed up in the testing.

Unfortunately the timing really sucked, as I was just heading out
when I found that out and I didn't have time to figure out exactly what
went wrong in the MFC. I just backed out what appeared to be the problem commit.
It turns out there was another part of it.
This is the missing change that was buried in another commit.

Apparently, the following is also needed, to revert correctly.
--- src/sys/netinet/ip_fw2.c.orig	Mon Apr  2 11:48:03 2007
+++ src/sys/netinet/ip_fw2.c	Mon Nov 20 18:19:10 2006
@@ -3861,7 +3836,7 @@

   case O_PIPE:
   case O_QUEUE:
-   if (cmdlen != F_INSN_SIZE(ipfw_insn))
+   if (cmdlen != F_INSN_SIZE(ipfw_insn_pipe))
   goto bad_size;
   goto check_action;

I'll commit this and then forward-change the MFC again as soon as I find out
where the ball was dropped in the MFC testing.







happen.  Roll your sources back to Friday and you will be OK until


yaya

but essential and especially mature code should be tested before committing
changes, I guess. I believe that ipfw wasn't tested before being hacked and
committed this time, and overall, btw, there was an alert and reply to the
commit msg on cvs which then was politely ignored until Tuesday ... luck
that it wasn't the bootstrap or something



It's sorted out.  Remember, it's a best-effort, not perfect-effort, project.


sure, but when became perfect the honor is welcome as it comes for free when 
it went wrong ;)



 ---Mike




João










Re: ipfw add pipe broken in RELENG_6

2007-04-03 Thread Julian Elischer


Mike Tancsa wrote:

At 10:07 AM 4/1/2007, JoaoBR wrote:


it seems I can not add pipes with releng6 sources from the last days

ipfw add pipe 1 ip from any to any
ipfw: getsockopt(IP_FW_ADD): Invalid argument


I think this is what's needed in /usr/src/sbin/ipfw.  Looking at the
diffs between HEAD and RELENG_6 (apart from the kernel nat stuff), below
seems to be what's different.


Somewhere between my MFC testing and the commits there seems to have been a
screwup.

I think it happened because I reverted an MFC out of my list of MFCs to do
after I had done some tests, because they caused a failure, and I hadn't
realised that they affected this code too.

I'm doing testing now and should be able to confirm this in a short while.






[smicro1U]# diff -u ipfw2.c.orig ipfw2.c
--- ipfw2.c.orig	Mon Apr  2 22:28:33 2007
+++ ipfw2.c	Mon Apr  2 22:30:45 2007
@@ -3973,11 +3973,9 @@
 		break;
 
 	case TOK_QUEUE:
-		action->len = F_INSN_SIZE(ipfw_insn_pipe);
 		action->opcode = O_QUEUE;
 		goto chkarg;
 	case TOK_PIPE:
-		action->len = F_INSN_SIZE(ipfw_insn_pipe);
 		action->opcode = O_PIPE;
 		goto chkarg;
 	case TOK_SKIPTO:
@@ -4043,11 +4041,13 @@
 				"illegal forwarding port ``%s''", s);
 			p->sa.sin_port = (u_short)i;
 		}
-		lookup_host(*av, &(p->sa.sin_addr));
-		}
+		if (_substrcmp(*av, "tablearg") == 0)
+			p->sa.sin_addr.s_addr = INADDR_ANY;
+		else
+			lookup_host(*av, &(p->sa.sin_addr));
 		ac--; av++;
 		break;
-
+	}
 	case TOK_COMMENT:
 		/* pretend it is a 'count' rule followed by the comment */
 		action->opcode = O_COUNT;
[smicro1U]#



The command seems to be getting tripped up in /usr/src/sys/netinet/ip_fw2.c

case O_QUEUE:
if (cmdlen != F_INSN_SIZE(ipfw_insn))
goto bad_size;
goto check_action;

where size=2 and cmdlen=1 on opcode=50
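
To make the size=2 versus cmdlen=1 mismatch concrete, here is a minimal sketch of how instruction lengths are counted in 32-bit words. The structs are simplified stand-ins, not the real ipfw_insn and ipfw_insn_pipe definitions: the field names are illustrative, and a 32-bit member replaces the real pointer so the sizes come out as on i386. check_cmdlen is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for a one-word instruction header. */
struct insn {
	uint8_t  opcode;
	uint8_t  len;		/* length in 32-bit words */
	uint16_t arg1;
};

/* Simplified stand-in for the pipe/queue instruction: header plus one word. */
struct insn_pipe {
	struct insn o;
	uint32_t    pipe_ref;	/* placeholder for the real pointer member */
};

/* Size of an instruction type in 32-bit words, in the style of F_INSN_SIZE(). */
#define INSN_WORDS(t) ((unsigned)(sizeof(t) / sizeof(uint32_t)))

/* Returns 0 if the length matches what the checker expects, -1 for "bad size". */
int
check_cmdlen(unsigned cmdlen, unsigned expected_words)
{
	assert(expected_words > 0);
	return cmdlen == expected_words ? 0 : -1;
}
```

When userland builds the rule with a one-word length but the kernel checks against the two-word type (or the other way around), the check fails and the rule is rejected, which is consistent with the "Invalid argument" reported at the top of this thread.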

---Mike



Re: ipfw add pipe broken in RELENG_6

2007-04-03 Thread Julian Elischer


Andrey V. Elsukov wrote:

Julian Elischer writes:
Somewhere between my MFC testing and the commits there seems to have
been a screwup.
I think it happened because I reverted an MFC out of my list of MFCs
to do after I had done some tests, because they caused a failure, and
I hadn't realised that they affected this code too.

I'm doing testing now and should be able to confirm this in a short
while.


Hi, Julian!

It seems that converting from time_second to time_uptime broke
`ipfw -t show'.
Now we have one PR:
http://www.freebsd.org/cgi/query-pr.cgi?pr=77

And I see a similar problem on my CURRENT.



Yeah, I seem to have MFC'd a bug.
I confirmed they behaved the same, but I was looking at binary data
and didn't notice the date.

I'm preparing a revert; however, it is a no-win situation, as that leaves you
open to sessions timing out immediately when the time is changed forward.


So, which is more important:
accurate timeouts or accurate reporting?

One thing that COULD be done would be to add the boot time to the reported
times. This would never let a session time out too early, but would give
slightly misleading report numbers if the time is adjusted. They would
'adjust' to the same offset against the 'new time',
i.e. if it shows "2 hours before now" before the change, it will show "2
hours before the new now" after the time is changed. This may be acceptable.
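
A small sketch of the arithmetic behind that idea, under the assumption that stamps are kept in seconds since boot and that stepping the clock shifts the boot-time estimate by the same amount. The function names are hypothetical and this is not the ipfw code.

```c
#include <assert.h>
#include <time.h>

/* Convert a seconds-since-boot stamp to wall-clock time for display. */
time_t
uptime_to_wallclock(time_t uptime_stamp, time_t boottime)
{
	return boottime + uptime_stamp;
}

/* Age used for timeouts: depends only on uptime, never on the wall clock. */
time_t
entry_age(time_t uptime_now, time_t uptime_stamp)
{
	assert(uptime_now >= uptime_stamp);
	return uptime_now - uptime_stamp;
}
```

Because both "now" and the reported stamp are derived from the same boot-time value, a clock step moves them together: the displayed offset stays at two hours, while the absolute time shown for the old event shifts by the size of the step.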



Re: ipfw add pipe broken in RELENG_6

2007-04-03 Thread Julian Elischer


Julian Elischer wrote:

Andrey V. Elsukov wrote:

Julian Elischer writes:
Somewhere between my MFC testing and the commits there seems to have
been a screwup.
I think it happened because I reverted an MFC out of my list of MFCs
to do after I had done some tests, because they caused a failure, and
I hadn't realised that they affected this code too.

I'm doing testing now and should be able to confirm this in a short
while.


Hi, Julian!

It seems that converting from time_second to time_uptime broke
`ipfw -t show'.
Now we have one PR:
http://www.freebsd.org/cgi/query-pr.cgi?pr=77

And I see a similar problem on my CURRENT.


I have committed a fix in -current
please check ip_fw2.c version 1.162
If this meets with general happiness I will re-MFC this 1.112 
along with this fix to it.







yeah I seem to have MFC'd a BUG..
I confirmed they behaved the same. but I was looking at binary data
and didn't notice the date..

I'm preparing a revert, however it is a no-win situation as that leaves 
you open to sessions timing out immediately when the time is changed 
forward.


So, which is more important?
accurate timeouts or accurate reporting?

One thing that COULD be done would be to add the boot time to the reported
times. this would never let a session time out too early, but would give 
slightly misleading report numbers if the time is adjusted. They would 
'adjust' to the same offset against the 'new time'.
i.e. if it shows 2 hours before now before the change it will show 2 
hours before the new now after the time is changed. This may be 
acceptable.



Re: [fbsd] HEADS UP: FreeBSD 5.3, 5.4, 6.0 EoLs coming soon

2006-10-11 Thread Julian Elischer


Jeremie Le Hen wrote:

Hi,

On Sun, Oct 01, 2006 at 12:30:22AM -0700, FreeBSD Security Officer wrote:
  

Users of FreeBSD 4.11 systems are also reminded that that FreeBSD 4.11
will reach its End of Life at the end of January 2007 and that they
should be making plans to upgrade or replace such systems.



Though I admit RELENG_4 is getting dusty, it is not rusty.  I believe it
is still used in many places because of its stability and performance.

For instance, according to Julian Elischer's posts, it seems he is still
working on it.
  


Weeell, we (Ironport) just moved to 6.1 but my previous employer 
(Vicor) is still using it.



Is it envisageable to extend the RELENG_4's and RELENG_4_11's EoL once
more ?

Thank you.
Regards,
  


Re: panic: vm_thread_new: kstack allocation failed

2006-09-01 Thread Julian Elischer


Vyacheslav Vovk wrote:

Can you see how many threads there are in the system?
I think you will have to extract this information from the zone allocator.

I just realised there is no effective limit on kernel threads in the system.
Probably one could cause this with a fork-bomb approach using forks and
thread creation.



Unread portion of the kernel message buffer:
panic: vm_thread_new: kstack allocation failed
cpuid = 3
Uptime: 7d4h30m58s

 



Re: panic: vm_thread_new: kstack allocation failed

2006-09-01 Thread Julian Elischer


Kip Macy wrote:


I've seen this when running stress2 with a large number of
incarnations. Why don't we return an error to the user?



programmer ENOTIME

patch welcome!



-Kip

On 9/1/06, Julian Elischer [EMAIL PROTECTED] wrote:


Vyacheslav Vovk wrote:

Can you see how many threads there are in the system?
I think you will have to extract this information from the zone
allocator.


I just realised there is no effective limit on kernel threads in the
system.

Probably one could cause this with a fork-bomb approach using forks and
thread creation.

Unread portion of the kernel message buffer:
panic: vm_thread_new: kstack allocation failed
cpuid = 3
Uptime: 7d4h30m58s




Re: Monitoring temperature with acpi (sysctls)

2006-07-27 Thread Julian Elischer


Someone mentioned that you can't reach the smbus on ASUS boards.
That's because they turn it off in the BIOS. They turn it on and off as
they need to read stuff for their SMI (well, on some of their boards at least).

You can turn it on again using pciconf, but I forget the exact incantation.

(I've asked someone to send me the script so I'll have it later if
anyone wants it)



Re: FreeBSD 6.1 Released

2006-05-08 Thread Julian Elischer


Scott Long wrote:




-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

It is my great pleasure and privilege to announce the availability of
FreeBSD 6.1-RELEASE.  This release is the next step in the development
of the 6.X branch, delivering several performance improvements, many
bugfixes, and a few new features.  These include:

~ Addition of a keyboard multiplexer.  This allows USB and PS/2 keyboards
 to coexist without any special options at boot.
~ Many fixes for filesystem stability.  High load stress tests are now run
 successfully on a regular basis as part of the normal FreeBSD QA process.
~ Automatic configuration for man Bluetooth devices, as well as automatic

 



  s/man/many


 support for running WiFi access points.
~ Addition of drivers for new ethernet and SAS and SATA RAID controllers.
~ BIND updated to 9.3.2
~ sendmail updated to 8.13.6

NOTE: It was discovered at the last minute that the errata notes that were
packaged with the release are out of date.  For a complete list of known
problems, please see the online errata list, available at:

   http://www.FreeBSD.org/releases/6.1R/errata.html
 



The above points to a file that says 6.0 errata.


Re: new feature: private IPC for every jail

2006-04-04 Thread Julian Elischer


Robert Watson wrote:



On Tue, 4 Apr 2006, Peter Jeremy wrote:


On Mon, 2006-Apr-03 16:34:59 +0100, Robert Watson wrote:

(2) The name space model for system v ipc is flat, so while it's desirable
    to allow the administrator in the host environment to monitor and
    control resource use in the jail (for example, delete allocated but
    unused segments), doing that requires developing an administrative
    model for it.


The SysV SHM name space is made up of a 32-bit user-selected key 
which is mapped into a 32-bit (system chosen) identifier, which (on 
FreeBSD) is made up of a 16-bit pool identifier (in the range 
0..shmmni-1) and a 16-bit generation counter.


At the expense of restricting shmmni, the generation counter and 
JAIL_MAX, it would seem possible to embed prison.pr_id into the shmid 
and treat pr_id as an (implicit) part of the key - insisting they 
must match for jailed processes.  Since the name space remains the 
same, ipcs and ipcrm would not be affected and a non-jailed ipcrm 
could delete jailed IPC by identifier.
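
The identifier layout under discussion can be sketched as follows. The first function mirrors the 16-bit index plus 16-bit generation split described above. The second layout, which carves out room for a prison id, is purely hypothetical: the bit widths and all jipcid_* names are invented for illustration and come from neither the PR nor the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Classic layout: generation counter in the high 16 bits, pool index low. */
uint32_t
ipcid_pack(uint16_t index, uint16_t seq)
{
	return ((uint32_t)seq << 16) | index;
}

/*
 * Hypothetical jail-aware layout (widths are illustrative only):
 * bit 31 unused so the id stays non-negative as an int,
 * bits 24-30 prison id, bits 12-23 generation, bits 0-11 pool index.
 */
#define JID_BITS 7
#define SEQ_BITS 12
#define IDX_BITS 12

uint32_t
jipcid_pack(unsigned prison, unsigned seq, unsigned index)
{
	assert(prison < (1u << JID_BITS));
	assert(seq < (1u << SEQ_BITS));
	assert(index < (1u << IDX_BITS));
	return (prison << (SEQ_BITS + IDX_BITS)) | (seq << IDX_BITS) | index;
}

unsigned
jipcid_prison(uint32_t id)
{
	return id >> (SEQ_BITS + IDX_BITS);
}

unsigned
jipcid_seq(uint32_t id)
{
	return (id >> IDX_BITS) & ((1u << SEQ_BITS) - 1);
}

unsigned
jipcid_index(uint32_t id)
{
	return id & ((1u << IDX_BITS) - 1);
}

/* Prison 0 stands for the unjailed host, which may manage any id. */
int
jipcid_allowed(uint32_t id, unsigned caller_prison)
{
	return caller_prison == 0 || jipcid_prison(id) == caller_prison;
}
```

The narrower index and generation fields show the trade-off mentioned above: every bit given to the prison id is taken from shmmni or from the generation counter.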


On the surface, this approach looks easier than having a distinct 
name space associated with each prison (as per kern/48471) and has 
the advantage of allowing non-jailed processes to manage jailed IPC. 
The disadvantage is restricting the ranges of various counters - 
though I believe they are overly generous by default.


This doesn't really address the problem of SysV IPC and jails 
becoming more intimately entwined.



Hmm.  This sounds like it might be workable.  To make sure I 
understand your proposal:


- We add a new prison ID field to the in-kernel description of each segment,
  semaphore, message queue, etc.  This is initialized to the prison ID of the
  process creating the object at the time of creation.

- shmget(), et al, will, in addition to matching the key when searching for an
  existing object, also attempt to match the prison ID of the object to
  the process.  For the sake of completeness, we will use prison ID 0 for
  unjailed processes (or something along those lines).  This guarantees that
  two jails, or even the host and a jail, will never receive an ID already
  allocated to another jail, and in particular, not an ID for an object from
  another jail with the same key as might be used in the current jail.



What if a host wants to communicate with a jail?
Does it make sense?
At the moment a host can see into a jail in many ways (filesystem,
sockets, process space, etc.).




- shmat(), et al, will perform an access control check to confirm that if a
  process is jailed, its prison ID matches that of the object.

Is it necessary, as you suggest, to change the IPC ID name space at
all?  I assume applications do consistently use shmget() to look up
IDs, and that they can't/don't make assumptions about long-term
persistence of those mappings across boot (which is effectively what a
jail restart is)?  Is the behavior of IXSEQ_TO_IPCID() something that
has documented or relied-on properties, or are we free to perform a
mapping from a name (key) to an object (id) in any way we choose?


I guess another change is also needed:

- At jail termination, we GC all resources with the prison ID in
  question.


This prevents a future jail from turning up with the same ID and 
seeing old shared memory (etc) segments.
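
The three rules proposed above (match on key plus prison ID at lookup, check the prison ID on attach, garbage-collect on jail termination) can be sketched over a toy object table. Everything here is hypothetical: struct ipc_obj and the ipc_* functions are illustration only and do not correspond to the kernel's SysV IPC structures.

```c
#include <assert.h>
#include <stddef.h>

/* Toy descriptor for one IPC object, tagged with its creator's prison. */
struct ipc_obj {
	long key;		/* user-selected key */
	int  prison_id;		/* 0 stands for the unjailed host */
	int  in_use;
};

/* shmget()-style lookup: both the key and the caller's prison must match. */
int
ipc_lookup(const struct ipc_obj *tab, size_t n, long key, int prison_id)
{
	assert(tab != NULL);
	for (size_t i = 0; i < n; i++) {
		if (tab[i].in_use && tab[i].key == key &&
		    tab[i].prison_id == prison_id)
			return (int)i;
	}
	return -1;
}

/* shmat()-style check: a jailed caller may only touch its own objects. */
int
ipc_access_ok(const struct ipc_obj *obj, int caller_prison)
{
	assert(obj != NULL);
	return caller_prison == 0 || obj->prison_id == caller_prison;
}

/* Jail teardown: release every object owned by the prison; returns the count. */
int
ipc_gc_prison(struct ipc_obj *tab, size_t n, int prison_id)
{
	int freed = 0;

	assert(tab != NULL);
	for (size_t i = 0; i < n; i++) {
		if (tab[i].in_use && tab[i].prison_id == prison_id) {
			tab[i].in_use = 0;
			freed++;
		}
	}
	return freed;
}
```

With this shape, two jails can use the same key without ever resolving to each other's object, and a later jail that reuses a prison ID starts with an empty set of objects.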


Robert N M Watson

Re: new feature: private IPC for every jail

2006-04-03 Thread Julian Elischer


Robert Watson wrote:



On Mon, 3 Apr 2006, Marc G. Fournier wrote:


http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/48471

[kernel] [patch] new feature: private IPC for every jail

It's an ancient 4.x patch for having private IPC in a jail ... not
sure how hard it would be to bring it up to 6.x / -current standards
though ... but it seems like something 'good' that is needed ...



In the past I've looked at doing things along these lines, but usually 
stall after a first hack when trying to decide how to deal with two 
critical issues:


(1) The fact that system v ipc primitives are loadable, and unloadable,
    which requires some careful handling relating to registration order,
    etc.



this is related to the problem that needs to be solved for getting 
vimage into -current.




(2) The name space model for system v ipc is flat, so while it's desirable
    to allow the administrator in the host environment to monitor and
    control resource use in the jail (for example, delete allocated but
    unused segments), doing that requires developing an administrative
    model for it.



it is possible the admin environment can't see it.
unless you prefix it with something..



These challenges can be surmounted, but doing so in a nice way
requires some thought.


Robert N M Watson

Re: freebsd-stable Digest, Vol 120, Issue 5

2005-07-28 Thread Julian Elischer






Has anyone used IPMI (BMC) with the x346 and FreeBSD successfully?

Today I configured all the BMC options with bmc_cfg.exe, and when the server
is in DOS mode I can log in to it and reboot it. But when FreeBSD comes
up, the connection to the BMC is lost.
At the IBM help desk some technicians told me: "BMC works only on Linux and
Windows because on BSD IBM can't earn money." But I don't believe it;
I've searched Google for IPMI for FreeBSD and found two projects:
1. http://sourceforge.net/projects/ipmi-bsd/ - but it stopped on July 18, 2002.
2. http://www.gnu.org/software/freeipmi/ - it looks like it stopped on
2004-10-25.

Or maybe there is some other project? If any of you use IPMI on
FreeBSD, please share your knowledge.

   


Sorry, I can't help you with IPMI on FreeBSD, but ...

 



I have run the OpenIPMI code on FreeBSD with success,
but only on the Intel servers.

The Intel servers use a version of the Intel ethernet chips that
have a back door into them specifically for the BMC to use
and work pretty much regardless of whether the OS is using them or not.

The Broadcom chips as used in the x346 don't seem to have this, and if
the OS resets the chip the BMC loses access to it. What is needed is some
way to tell the bge (I think it is the gig version, if not the bfe) driver
to leave the chip alone if it finds it.


One clue may be to see whether it's the probe that does it, or whether it
just stops working when you ifconfig the interface.

Or maybe there is something else that can help me reboot the machine
if the system hangs.

   


Of course there are other methods of accessing the box if it has a problem.
Take an old-fashioned console server. Deactivate the IPMI stuff in your
xSeries and look in the BIOS for the option to redirect the BIOS to the serial port.
Actually you can do everything remotely with a good console server
(I'd suggest Cyclades console servers).
Additionally you can connect a power switch to the console server on one
side and the xSeries to this manageable power strip on the other.
I used the Cyclades console server + Cyclades power strip and can do an
easy ssh to the serial port of the server. If the server crashes _hard_
I can do a CTRL+p on the console and get a power management menu.

Then I just say Power Off and Power On and off the server goes :)

So... why IPMI ?
 




IPMI lets you power down the server too.
The IBM servers with the Broadcom chips are, however, a
problem that has yet to be solved. Linux somehow knows not to reset the
device, but I have not looked into how it knows this.



hope that helps a bit ;)
- Marian



Re: Fw: Re: Massive sound changes / fix (24/32bit pcm support, new sampling rate converter, various fixes)

2005-07-09 Thread Julian Elischer


sebastian ssmoller wrote:

just FYI ...

regards,
seb

Begin forwarded message:

Date: Sat, 9 Jul 2005 10:24:57 +0200
From: sebastian ssmoller [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Re: Massive sound changes / fix (24/32bit pcm support, new
sampling rate converter, various fixes)


hi,
i just wonna say: THX! really GREAT work! ... this improves sound
quality on my boxes much !!   ;-)

THX,
regards,
seb



After some time, I've decided to release this (massive, 4k lines) diff
to our sound driver. This needs proper review and confirmation before
it can be committed.

Patches for both HEAD / RELENG_5 available at:



How do these changes affect the recently submitted changes to improve OSS 
compatibility?


Re: WAS: FreeBSD MySQL still WAY slower than Linux

2005-06-17 Thread Julian Elischer





Wilko Bulte [EMAIL PROTECTED] writes:

 


If you give me $5 per Unix system found there I can retire here and now.
   

For financial transaction processing, and the customer's accounts? I 
hope it's not my bank.. mkb.



Hmmm, we processed something over a trillion dollars in bank backends
last year on FreeBSD 4.8 (plus patches) on rack-mounted PCs.
And we didn't lose any of them (the dollars, that is).





Re: USB changes.

2005-04-28 Thread Julian Elischer


+++
Currently, I find my P4 hanging just after discovering the parallel
port and mounting disk; in other words, just here:
ppc0: parallel port not found.
ad0: 28629MB ST330620A [58168/16/63] at ata0-master UDMA100
but -only- when my Logitech USB mouse is plugged in. Now, if I
unplug it and hit reset (not Ctrl-Alt-Del; the keyboard is frozen) the
system boots, and I can obtain X w/ a functional mouse. Yesterday, of
course, prior to the USB change the system did not hang.
I saw this problem in testing and THOUGHT I had checked in the fix..
Can you confirm that usb.c ends with:
 SYSINIT(usb_cold_explore, SI_SUB_INT_CONFIG_HOOKS, SI_ORDER_FIRST,
   usb_cold_explore, NULL);

I noticed that some code was changed in between my discovery of the
hanging and my attempt to fix it:
Apr 27 21:15 subr_bus.c
but this change, and the subsequent world update, did not solve the
issue of the hanging mouse.
The changes that you are referring to include some to defer probing of the
USB 1.1 busses until after the USB 2.0 busses have been configured.
They should probe for the devices at around the same time that the SCSI
devices probe.
I'll see if I can duplicate your problem.. I tested with several USB 1.1
devices but a mouse was not amongst them.


See, I know I've only myself to blame for missing the announcement
and/or starting an upgrade in an interstice between stability and
apparent instability. But still...test first, deploy later, perhaps?
Kernel bits, had them for a couple of years now:
device  uhci
device  ohci
device  ehci
device  usb
device  ugen
device  ums
device  uscanner
With my luck, it's probably not the mouse after all.
I assume it works right if you remove the mouse before booting and reinsert 
it after the kernel has booted?


 


Re: USB changes.

2005-04-28 Thread Julian Elischer


Julian Elischer wrote:

+++
Currently, I find my P4 hanging just after discovering the parallel
port and mounting disk; in other words, just here:
but -only- when my Logitech USB mouse is plugged in. Now, if I
unplug it and hit reset (not Ctrl-Alt-Del; the keyboard is frozen) the
system boots, and I can obtain X w/ a functional mouse. Yesterday, of
course, prior to the USB change the system did not hang.

try plugging the mouse in again before X starts.

I saw this problem in testing and THOUGHT I had checked in the fix..
Can you confirm that usb.c ends with:
 SYSINIT(usb_cold_explore, SI_SUB_INT_CONFIG_HOOKS, SI_ORDER_FIRST,
   usb_cold_explore, NULL);


I noticed that some code was changed in between my discovery of the
hanging and my attempt to fix it:
Apr 27 21:15 subr_bus.c
but this change, and the subsequent world update, did not solve the
issue of the hanging mouse.

The changes that you are referring to include some to defer probing of
the USB 1.1 busses until after the USB 2.0 busses have been configured.
They should probe for the devices at around the same time that the
SCSI devices probe.
I'll see if I can duplicate your problem.. I tested with several USB 1.1
devices but a mouse was not amongst them.

I tried to duplicate this but failed.. my mouse was found just fine..
Can you boot with the -v option?
i.e. boot -v from the loader prompt.


See, I know I've only myself to blame for missing the announcement
and/or starting an upgrade in an interstice between stability and
apparent instability. But still...test first, deploy later, perhaps?
Kernel bits, had them for a couple of years now:
device  uhci
device  ohci
device  ehci
device  usb
device  ugen
device  ums
device  uscanner
With my luck, it's probably not the mouse after all.

what about if you don't have the ums driver loaded?
I assume it works right if you remove the mouse before booting and 
reinsert it after the kernel has booted?


Re: USB changes.

2005-04-28 Thread Julian Elischer


Joe Altman wrote:
On Thu, Apr 28, 2005 at 11:23:14AM -0700, Julian Elischer wrote:
 

+++
Currently, I find my P4 hanging just after discovering the parallel
port and mounting disk; in other words, just here:
ppc0: parallel port not found.
ad0: 28629MB ST330620A [58168/16/63] at ata0-master UDMA100
but -only- when my Logitech USB mouse is plugged in. Now, if I
unplug it and hit reset (not Ctrl-Alt-Del; the keyboard is frozen) the
system boots, and I can obtain X w/ a functional mouse. Yesterday, of
course, prior to the USB change the system did not hang.
 

I saw this problem in testing and THOUGHT I had checked in the fix..
Can you confirm that usb.c ends with:
SYSINIT(usb_cold_explore, SI_SUB_INT_CONFIG_HOOKS, SI_ORDER_FIRST,
  usb_cold_explore, NULL);
   

Hmmm...
/usr/src/lib/libusbhid/
Nope..
 

/usr/src/sys/dev/usb/usb.c
Aha...maybe this is it:
/usr/src/usr.sbin/usbd/
 

nope
ll /usr/src/usr.sbin/usbd/
2734320 -rw-r--r--1 root  wheel169 Apr 25  2001 Makefile
2738254 -rw-r--r--1 root  wheel   4460 Jun 22  2003 usbd.8
2736938 -rw-r--r--1 root  wheel  30079 Nov 29  2003 usbd.c
2737339 -rw-r--r--1 root  wheel   5046 Aug 27  2004 usbd.conf.5
Is that the one?
Here is a part of 'tail -25' on the file, showing the bottom:
/* check the event queue */
        if (handle_events && (FD_ISSET(fd, &r) || error == 0))
        {
            if (verbose >= 2)
                printf("%s: processing event queue "
                       "%son %s\n",
                       __progname,
                       (error ? "" : "due to timeout "),
                       USBDEV);
            process_event_queue(fd);
        }
    }
}
So no, this file doesn't end in what you ask; I can't see anywhere
else it might live; is there some other usbd.c you need?
 

I noticed that some code was changed in between my discovery of the
hanging and my attempt to fix it:
Apr 27 21:15 subr_bus.c
but this change, and the subsequent world update, did not solve the
issue of the hanging mouse.
 

The changes that you are referring to include some to defer probing of the
USB 1.1 busses until after the USB 2.0 busses have been configured.
   

Ah...okay; well, I was in the dark anyway, may as well whistle.
 

Theoretically it may be that the USB code doesn't like being run at that
time..
I'll try to duplicate your setup more closely.

is it a uhci or ohci controller?
I just realised I did most of my testing with ohci..
will hunt down a uhci machine to test with.
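
One way to answer the uhci-or-ohci question is to filter the boot messages. This is only a sketch: `usb_controllers` is a hypothetical helper name, and it assumes the controller lines start with the driver name and unit number, as in the dmesg output quoted elsewhere in this thread.

```sh
#!/bin/sh
# Sketch: list which USB host controller drivers attached, reading
# dmesg-style text on stdin. usb_controllers is a made-up helper name.
usb_controllers() {
    # Keep only the leading "uhci0"/"ohci1"/"ehci0" token, strip the
    # unit number, and report each driver name once.
    grep -Eo '^(uhci|ohci|ehci)[0-9]+' | sed 's/[0-9]*$//' | sort -u
}

# Typical use:  dmesg | usb_controllers
```

If the output contains uhci, the machine has a UHCI controller for its USB 1.1 ports; ohci would indicate the other USB 1.1 controller type.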
 

I'll see if I can duplicate your problem.. I tested with several USB 1.1
devices but a mouse was not amongst them.
   

Okay; it may be a one off, relative to me only, as I've seen no other
indications of issues. I am behind on my list reading, though.
 

I assume it works right if you remove the mouse before booting and reinsert 
it after the kernel has booted?
   

Yes; once I have a console, the mouse is detected:
ums0: Logitech USB Receiver, rev 1.10/23.02, addr 2, iclass 3/1
ums0: 7 buttons and Z dir.
Then, it works in X just fine; I forgot to test it on the console.
 

shouldn't matter..

Re: USB changes.

2005-04-28 Thread Julian Elischer


Joe Altman wrote:
On Thu, Apr 28, 2005 at 11:23:14AM -0700, Julian Elischer wrote:
 


[...]
 

I assume it works right if you remove the mouse before booting and reinsert 
it after the kernel has booted?
   

Yes; once I have a console, the mouse is detected:
ums0: Logitech USB Receiver, rev 1.10/23.02, addr 2, iclass 3/1
ums0: 7 buttons and Z dir.
Then, it works in X just fine; I forgot to test it on the console.
 

seems to work for me with uhci too..
Can you see if changing the BIOS USB options makes a difference?
[...]
uhci0: UHCI (generic) USB controller port 0xe800-0xe81f irq 5 at device 29.0 on pci0
usb0: UHCI (generic) USB controller on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: UHCI (generic) USB controller port 0xec00-0xec1f irq 9 at device 29.1 on pci0
usb1: UHCI (generic) USB controller on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
pci0: unknown card (vendor=0x8086, dev=0x25ab) at 29.4
pci0: unknown card (vendor=0x8086, dev=0x25ac) at 29.5
ehci0: EHCI (generic) USB 2.0 controller mem 0xfe7ffc00-0xfe7f irq 7 at device 29.7 on pci0
usb2: EHCI version 1.0
usb2: companion controllers, 2 ports each: usb0 usb1
usb2: EHCI (generic) USB 2.0 controller on ehci0
usb2: USB revision 2.0
uhub2: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub2: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub2: single transaction translator
uhub2: 4 ports with 4 removable, self powered
[...]
psm0: PS/2 Mouse irq 12 on atkbdc0
psm0: model IntelliMouse, device ID 3
vga0: Generic ISA VGA at port 0x3c0-0x3df iomem 0xa-0xb on isa0
sc0: System console at flags 0x100 on isa0
sc0: VGA 16 virtual consoles, flags=0x100
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A, console
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0: parallel port not found.
DUMMYNET initialized (011031)
IP packet filtering initialized, divert enabled, rule-based forwarding enabled, default to accept, logging disabled
IPsec: Initialized Security Association Processing.
ad0: 114473MB ST3120026A [232581/16/63] at ata0-master UDMA100
ums0: Microsoft Microsoft 3-Button Mouse with IntelliEye(TM), rev 1.10/3.00, addr 2, iclass 3/1
ums0: 3 buttons and Z dir.
Mounting root from ufs:/dev/ad0s1a



Re: USB changes.

2005-04-28 Thread Julian Elischer


Joe Altman wrote:
On Thu, Apr 28, 2005 at 12:55:31PM -0700, Julian Elischer wrote:
 

Can you confirm that usb.c ends with:
SYSINIT(usb_cold_explore, SI_SUB_INT_CONFIG_HOOKS, SI_ORDER_FIRST,
usb_cold_explore, NULL);
   

/usr/src/sys/dev/usb/usb.c
   

^^^
Where does the file that you need live?
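
The check being asked for can be run mechanically. A minimal sketch, assuming the source tree lives under /usr/src; `has_cold_explore_hook` is a hypothetical helper, not part of any FreeBSD tool:

```sh
#!/bin/sh
# Sketch: report whether the last few lines of a source file contain the
# SYSINIT hook for usb_cold_explore. The helper name is made up.
has_cold_explore_hook() {
    tail -n 10 "$1" | grep -q 'SYSINIT(usb_cold_explore'
}

# Typical use:
#   has_cold_explore_hook /usr/src/sys/dev/usb/usb.c && echo "fix is present"
```

Note that the file in question is the kernel's usb.c under sys/dev/usb, not usbd.c from usr.sbin/usbd.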
 


Re: USB changes.

2005-04-28 Thread Julian Elischer

Since I can't make this happen here, I'm going to need help..
Joe Altman wrote:
Apr 27 00:07 /usr/src/sys/dev/usb/usb.c
/* Explore USB busses at the end of device configuration. */
Static void
usb_cold_explore(void *arg)
{
   struct usb_softc *sc;
 

can you add this line here:
   printf("HEY WE GOT HERE!\n");
   KASSERT(cold || TAILQ_EMPTY(&usb_coldexplist),
       ("usb_cold_explore: busses to explore when !cold"));
   while (!TAILQ_EMPTY(&usb_coldexplist)) {
       sc = TAILQ_FIRST(&usb_coldexplist);
       TAILQ_REMOVE(&usb_coldexplist, sc, sc_coldexplist);
 

and:
       printf("probing a USB 1.1 bus.\n");
       sc->sc_bus->use_polling++;
       sc->sc_port.device->hub->explore(sc->sc_bus->root_hub);
       sc->sc_bus->use_polling--;
   }
}
DRIVER_MODULE(usb, ohci, usb_driver, usb_devclass, 0, 0);
DRIVER_MODULE(usb, uhci, usb_driver, usb_devclass, 0, 0);
DRIVER_MODULE(usb, ehci, usb_driver, usb_devclass, 0, 0);
SYSINIT(usb_cold_explore, SI_SUB_INT_CONFIG_HOOKS, SI_ORDER_FIRST,
   usb_cold_explore, NULL);
#endif
 


Re: USB changes.

2005-04-28 Thread Julian Elischer


Joe Altman wrote:
On Thu, Apr 28, 2005 at 01:40:50PM -0700, Julian Elischer wrote:
 

Since I can't make this happen here, I'm going to need help..
   

What do you need? Relatively speaking, I'm sort of a newbie at the
more complex aspects of bugs, and this strikes me as somewhat complex.
But if you are willing to tell me what you need, and how to proceed or
what to read to help you, I'll do what I can.
 

I asked you to add 2 lines to that file and try again.

more MFCs to RELENG_4

2005-04-27 Thread Julian Elischer

You'll find a diff at:
http://www.freebsd.org/~julian/usb-4.diff
This merges a lot of the USB infrastructure.
I was amazed how little changing had to be done
to allow this to work on 4.x.
The files become almost the same as on -current
(minus small changes here and there).
If you use USB on 4.x, please check this out and let me know if I have
broken anything.

Especially try mistreating it (if it takes such mistreatment in 4.x
at the moment) to see if we've broken error handling etc.
I'll commit this in a day or so and keep going.. hubs next I think.

Re: more MFCs to RELENG_4

2005-04-27 Thread Julian Elischer


Julian Elischer wrote:
You'll find a diff at:
http://www.freebsd.org/~julian/usb-4.diff
This merges a lot of the USB infrastructure.
I was amazed how little changing had to be done
to allow this to work on 4.x.
The files become almost the same as on -current
(minus small changes here and there).
If you use USB on 4.x, please check this out and let me know if I have
broken anything.

That doesn't mean I haven't tried it, just that more testers is better..
Especially try mistreating it (if it takes such mistreatment in 4.x
at the moment) to see if we've broken error handling etc.
I'll commit this in a day or so and keep going.. hubs next I think.
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-usb
To unsubscribe, send any mail to [EMAIL PROTECTED]

Headsup: USB MFCs to 4.x

2005-04-26 Thread Julian Elischer

I have just merged in most of the lowest level changes to 4.x from 6.x.
If you use USB on 4.x machines and are planning on following RELENG_4
then I suggest you test the latest sources to avoid nasty surprises later.
I will probably merge some of the actual device drivers as well, though
I have limited resources for testing them..


uaudio MFCs to 4.x

2005-04-22 Thread Julian Elischer

I have just merged a set of changes from current to RELENG_4 for USB audio.
USB audio in 4.x was not terribly well supported anyhow, but if you are
running RELENG_4 and have a USB audio output device (input may not work yet),
you might like to do a before-and-after test. I have limited USB audio
output equipment, and it didn't work on a lot of devices prior to these
changes, but I'd like to hear if it got worse.

:-)

anyone had freeipmi running under 4.8?

2005-02-23 Thread Julian Elischer

Well, to be more precise.. under a 4.8-level set of other packages.
You need guile 1.6, and that pulls in half of GNOME (why?)
and tons of other stuff, and in the end nothing seems
to compile/work any more :-/

Re: update for 4.11 Security Officer-supported branches

2005-01-12 Thread Julian Elischer


Barry Bouwsma wrote:
[thread hijacked from freebsd-security@ and landed in stable@ ]
On Mon, 10 Jan 2005 11:16:34 -0800, Julian Elischer wrote:
 

While a 4.12 will PROBABLY not happen, I do plan on continued MFCs of
important changes to RELENG_4, as I do not envision my customers moving
to 5.x until some time in 2006 at the earliest. (Including fixes from
DragonFly, and possibly some new drivers and things like USB fixes.)
   

Thanks, Julian, and I'd like to contemplate getting a few of my MFC
hacks that I'm running on 4.x, added to releng, if possible.
One of these things I've found very handy is the `mount' option
introduced in 5.x that allows one to specify with -F an alternate
to /etc/fstab to be used.
we can probably MFC anything that doesn't break any existing ABI.
This I use for conditional mounting of devices which may or may
not be present, like external USB (to catch your interest, heh)
drives with a complicated partition layout.  I test if a known
drive is present at a certain device, and if so,
`mount -F /etc/fstab-da0' or `-da1' and so on, as part of my boot.
In order to adopt this change, I had to add to libc in 4.x as well
as butcher the `mount' code.  Does such a change stand a chance of
being added to 4.x, or are infrastructure changes required this way,
like to libc, off-limits outside my own hive of personal hackery?
 

libc is not usually a target in a legacy branch. Adding a new function to it
may be OK in some cases, but it is probably best not to do it.
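
For readers who want to picture the conditional mounting described above, here is a rough sketch. It assumes a mount(8) that understands -F (5.x, or 4.x with a local patch like the one described), that -a is combined with -F to mount everything in that file, and that per-drive files follow the /etc/fstab-<dev> naming from the message. The function names and the DEV_DIR, FSTAB_DIR and MOUNT overrides are hypothetical and only exist to allow dry runs. Testing for the device node is a reasonable proxy only with devfs; with the static /dev of 4.x some other probe for the drive would be needed.

```sh
#!/bin/sh
# Sketch only: mount the filesystems listed in a per-drive fstab when the
# drive appears to be attached. All names here are illustrative.

# Print the per-drive fstab path for a device name such as "da0".
fstab_for() {
    printf '%s/fstab-%s\n' "${FSTAB_DIR:-/etc}" "$1"
}

# Mount everything in the drive's fstab if the device node and the fstab
# file both exist; otherwise say so and return non-zero.
mount_if_present() {
    dev=$1
    tab=$(fstab_for "$dev")
    if [ -e "${DEV_DIR:-/dev}/$dev" ] && [ -f "$tab" ]; then
        ${MOUNT:-mount} -a -F "$tab"
    else
        echo "skipping $dev: device or $tab not present" >&2
        return 1
    fi
}

# Typical use from a boot script:
#   for dev in da0 da1; do mount_if_present "$dev" || true; done
```

Setting MOUNT=echo prints the command that would be run instead of executing it.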
(What's missing, as far as I know, that could be handy, is a
comparable option to `umount' when one wants to quickly detach a
drive with ten mounted filesystems.  I haven't looked at this idea)
(freebsd-legacy@ , anyone, for those of us too stubborn to join the
modern world, and get confused when the -stable list postings don't
make clear what release is being discussed, or want a quiet place to
mull over 2.2.x ?)
I was considering this..
When 4.11 dies, we may make such a list.

thanks
barry bouwsma
 


anyone using the umodem(4) module/driver?

2005-01-08 Thread Julian Elischer

If so I'd like to hear success/failure reports..
I'm looking at closing or otherwise acting on:
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/39341
however the original submitter has disappeared.
