Re: Stop scheduler on panic

2011-12-06 Thread Andriy Gapon
on 07/12/2011 00:11 Attilio Rao said the following:
> I'd just change this check on panicstr:
> @@ -606,9 +603,13 @@ kdb_trap(int type, int code, struct trapframe *tf)
>   intr = intr_disable();
> 
>  #ifdef SMP
> - other_cpus = all_cpus;
> - CPU_CLR(PCPU_GET(cpuid), &other_cpus);
> - stop_cpus_hard(other_cpus);
> + if (panicstr == NULL) {
> + other_cpus = all_cpus;
> + CPU_CLR(PCPU_GET(cpuid), &other_cpus);
> + stop_cpus_hard(other_cpus);
> + did_stop_cpus = 1;
> + } else
> + did_stop_cpus = 0;
> 
> to be SCHEDULER_STOPPED().

Makes sense.  I will do this.

> If you agree I can fix the kern_mutex, kern_sx and kern_rwlock parts
> and it should be done.

Since I am not very familiar with the details of that code, I can not be against
such a proposal :-)  What Kostik did seemed quite reasonable to me, but if that
can be further improved, then I am all for it.

-- 
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: FreeBSD 10.0-CURRENT/AMD64 (CLANG): lang/gcc46 fails to build

2011-12-06 Thread Steve Kargl
On Wed, Dec 07, 2011 at 05:56:31AM +0100, O. Hartmann wrote:
> config.status: creating ada/Makefile
> config.status: creating auto-host.h
> config.status: executing default commands
> gmake[2]: Leaving directory `/usr/ports/lang/gcc46/work/build'
> gmake[1]: *** [stage1-bubble] Error 2
> gmake[1]: Leaving directory `/usr/ports/lang/gcc46/work/build'
> gmake: *** [bootstrap-lean] Error 2
> *** Error code 1
> 
> Stop in /usr/ports/lang/gcc46.
> *** Error code 1
> 
> Stop in /usr/ports/lang/gcc46.
> 
> ===>>> make failed for lang/gcc46
> ===>>> Aborting update
> 

See if setting DISABLE_MAKE_JOBS helps.

-- 
Steve


FreeBSD 10.0-CURRENT/AMD64 (CLANG): lang/gcc46 fails to build

2011-12-06 Thread O. Hartmann
Hello.
On FreeBSD 10.0-CURRENT/amd64 I run into the error shown below when
updating the installation of the gcc46 compiler suite.

The OS has been compiled with CLANG; binutils 2.22 is installed, and it
was installed with both the UNAME_r settings and WITH_FBSD10_FIX set
in /etc/make.conf.

I was wondering whether others would also see this on CURRENT. On all
FreeBSD 9.0 boxes gcc46 compiles well.

Regards,
Oliver



Configuring stage 1 in ./gcc
clang -O3 -fno-strict-aliasing -pipe -march=native -I/usr/local/include
 -o fixincl fixincl.o fixtests.o fixfixes.o server.o procopen.o fixlib.o
fixopts.o ../libiberty/libiberty.a
echo timestamp > full-stamp
gmake[3]: Leaving directory
`/usr/ports/lang/gcc46/work/build/build-x86_64-portbld-freebsd9.9/fixincludes'
gmake[3]: Entering directory `/usr/ports/lang/gcc46/work/build/libcpp'
clang  -I.././../gcc-4.6-20111202/libcpp -I.
-I.././../gcc-4.6-20111202/libcpp/../include
-I.././../gcc-4.6-20111202/libcpp/include  -g -fkeep-inline-functions -W
-Wall -Wwrite-strings -Wmissing-format-attribute -Wstrict-prototypes
-Wmissing-prototypes -Wold-style-definition -Wc++-compat -pedantic
-Wno-long-long  -I.././../gcc-4.6-20111202/libcpp -I.
-I.././../gcc-4.6-20111202/libcpp/../include
-I.././../gcc-4.6-20111202/libcpp/include  -c -o charset.o -MT charset.o
-MMD -MP -MF .deps/charset.Tpo .././../gcc-4.6-20111202/libcpp/charset.c
.././../gcc-4.6-20111202/libcpp/charset.c:1371:1: error: conflicting
types for 'cpp_interpret_string'
cpp_interpret_string (cpp_reader *pfile, const cpp_string *from, size_t
count,
^
.././../gcc-4.6-20111202/libcpp/include/cpplib.h:742:13: note: previous
declaration is here
extern bool cpp_interpret_string (cpp_reader *,
^
.././../gcc-4.6-20111202/libcpp/charset.c:1452:1: error: conflicting
types for 'cpp_interpret_string_notranslate'
cpp_interpret_string_notranslate (cpp_reader *pfile, const cpp_string *from,
^
.././../gcc-4.6-20111202/libcpp/include/cpplib.h:745:13: note: previous
declaration is here
extern bool cpp_interpret_string_notranslate (cpp_reader *,
^
2 errors generated.
gmake[3]: *** [charset.o] Error 1
gmake[3]: Leaving directory `/usr/ports/lang/gcc46/work/build/libcpp'
gmake[2]: *** [all-stage1-libcpp] Error 2
gmake[2]: *** Waiting for unfinished jobs
configure: creating cache ./config.cache
checking build system type... x86_64-portbld-freebsd9.9
checking host system type... x86_64-portbld-freebsd9.9
checking target system type... x86_64-portbld-freebsd9.9
checking LIBRARY_PATH variable... ok
checking GCC_EXEC_PREFIX variable... ok

[...]


checking linker *_sol2 emulation support... no
checking linker --sysroot support... yes
checking __stack_chk_fail in target C library... checking for
__stack_chk_fail... yes
yes
checking dl_iterate_phdr in target C library... unknown
Using ggc-page for garbage collection.
checking whether to enable maintainer-specific portions of Makefiles... no
Links are now set up to build a native compiler for
x86_64-portbld-freebsd9.9.
checking for exported symbols... yes
checking for -rdynamic... yes
checking for library containing dlopen... none required
checking for -fPIC -shared... yes
configure: updating cache ./config.cache
configure: creating ./config.status
config.status: creating as
config.status: creating collect-ld
config.status: creating nm
config.status: creating Makefile
config.status: creating ada/gcc-interface/Makefile
config.status: creating ada/Makefile
config.status: creating auto-host.h
config.status: executing default commands
gmake[2]: Leaving directory `/usr/ports/lang/gcc46/work/build'
gmake[1]: *** [stage1-bubble] Error 2
gmake[1]: Leaving directory `/usr/ports/lang/gcc46/work/build'
gmake: *** [bootstrap-lean] Error 2
*** Error code 1

Stop in /usr/ports/lang/gcc46.
*** Error code 1

Stop in /usr/ports/lang/gcc46.

===>>> make failed for lang/gcc46
===>>> Aborting update

===>>> Update for lang/gcc46 failed
===>>> Aborting update


===>>> You can restart from the point of failure with this command line:
   portmaster  lang/gcc46





Re: binutils-2.22: ld and --copy-dt-needed-entries

2011-12-06 Thread Andrew W. Nosenko
On Tue, Dec 6, 2011 at 23:41, Andriy Gapon  wrote:
> on 06/12/2011 23:24 Martin Matuska said the following:
>> On 6.12.2011 17:48, Andriy Gapon wrote:
>>> Just for your information.
>>> It seems that ld from binutils-2.22 by default has 
>>> --no-copy-dt-needed-entries
>>> behavior, and so explicit --copy-dt-needed-entries is now needed where the
>>> previous default behavior is relied upon.
>>>
>>> A short excerpt from the man page for your convenience:
>>>
>>>> This option also has an effect on the resolution of symbols in
>>>> dynamic libraries.  With --copy-dt-needed-entries dynamic libraries
>>>> mentioned on the command line will be recursively searched,
>>>> following their DT_NEEDED tags to other libraries, in order to
>>>> resolve symbols required by the output binary.  With the default
>>>> setting however the searching of dynamic libraries that follow it
>>>> will stop with the dynamic library itself.  No DT_NEEDED links will
>>>> be traversed to resolve symbols.
>> What do we do with this?
>> We can go back, patch to behave as before or to continue.
>> Are there any serious complaints?
>
> I am not sure.  Eventually all upstreams of our ports will have to deal with
> this.  So far I've encountered only one problematic port (gegl) that links a
> binary with -lglib-2.0 expecting that a required -liconv dependency would be
> automatically picked up via DT_NEEDED.  libglib-2.0.so indeed has a DT_NEEDED
> entry for libiconv.so.  But this dependency is not explicitly advertised via
> pkg-config metadata:
> $ fgrep -i Libs /usr/local/libdata/pkgconfig/glib-2.0.pc
> Libs: -L${libdir} -lglib-2.0
> Libs.private: -liconv
>
> So there could be other issues related to this in the future.
> Perhaps this is actually an issue with glib, maybe it should have -liconv in
> Libs.  I am not really knowledgeable about this stuff.

As far as I understand
  http://lists.debian.org/debian-devel-announce/2011/02/msg00011.html ,
  https://wiki.ubuntu.com/NattyNarwhal/ToolchainTransition ,
  http://old.nabble.com/Make-no-copy-dt-needed-default--td32272377.html
correctly:

1. Upstreams (e.g. Glib) have had plenty of time to test this change.

2. If I just use Glib (for example), then all of Glib's iconv-related
functionality will continue to work; I don't need to explicitly add -liconv.
Things fail only if I call iconv_open() (for example) directly, bypassing
Glib, and rely on Glib to load libiconv as a side effect.  Of course,
that would be quite wrong on my part, because the use of libiconv as
Glib's charset conversion engine is an implementation detail that may
change some day, or simply with different configuration options.  Just
like GnuTLS switched from libgcrypt to libnettle.

3. Of course, something may fail, but I would not expect a large
number of failures (given that the major Linux distros are already
there).
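
A minimal sketch of the situation in point 2, assuming a program that
calls iconv itself; the helper name ascii_to_utf8() is made up for
illustration.  Because the program references iconv_open() directly,
on a system where iconv lives in libiconv it has to name -liconv on
its own link line under the new ld default, instead of relying on
Glib's DT_NEEDED entry:

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: convert an ASCII string to UTF-8 by calling
 * iconv directly.  Returns the number of bytes written to out, or
 * (size_t)-1 on failure.
 */
static size_t
ascii_to_utf8(const char *in, char *out, size_t outsz)
{
    iconv_t cd;
    char *inp, *outp;
    size_t inleft, outleft, rc;

    assert(in != NULL && out != NULL);

    inp = (char *)in;   /* POSIX declares iconv() with char ** here */
    outp = out;
    inleft = strlen(in);
    outleft = outsz;

    /*
     * Direct reference to an iconv symbol: it must be resolvable from
     * the libraries named on this program's own link line.
     */
    cd = iconv_open("UTF-8", "ASCII");
    if (cd == (iconv_t)-1)
        return ((size_t)-1);
    rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return ((size_t)-1);
    return (outsz - outleft);
}
```

A program that only calls Glib's own conversion routines has no such
direct reference and would not need the extra library.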

-- 
Andrew W. Nosenko 


Re: Stop scheduler on panic

2011-12-06 Thread Attilio Rao
2011/12/6 Andriy Gapon :
> on 06/12/2011 20:34 Attilio Rao said the following:
> [snip]
>> - I'm not entirely sure, why we want to disable interrupts at this
>> moment (before to stop other CPUs)?:
>
> Because I believe that stop_cpus_hard() should run in a context with 
> interrupts
> and preemption disabled.  Also, I believe that the whole panic handling  code
> should run in the same context.  So it was only natural for me to enter that
> context at this point.

I'm not against that, I don't see anything wrong with having
interrupts disabled at that point.

>> @@ -547,13 +555,18 @@ panic(const char *fmt, ...)
>>  {
>>  #ifdef SMP
>>       static volatile u_int panic_cpu = NOCPU;
>> +     cpuset_t other_cpus;
>>  #endif
>>       struct thread *td = curthread;
>>       int bootopt, newpanic;
>>       va_list ap;
>>       static char buf[256];
>>
>> -     critical_enter();
>> +     if (stop_scheduler_on_panic)
>> +             spinlock_enter();
>> +     else
>> +             critical_enter();
>> +
>>
>> - In this chunk I don't entirely understand the kdb_active check:
>> @@ -566,11 +579,18 @@ panic(const char *fmt, ...)
>>                   PCPU_GET(cpuid)) == 0)
>>                       while (panic_cpu != NOCPU)
>>                               ; /* nothing */
>> +     if (stop_scheduler_on_panic) {
>> +             if (panicstr == NULL && !kdb_active) {
>> +                     other_cpus = all_cpus;
>> +                     CPU_CLR(PCPU_GET(cpuid), &other_cpus);
>> +                     stop_cpus_hard(other_cpus);
>> +             }
>> +     }
>>  #endif
>>
>>       bootopt = RB_AUTOBOOT;
>>       newpanic = 0;
>> -     if (panicstr)
>> +     if (panicstr != NULL)
>>               bootopt |= RB_NOSYNC;
>>       else {
>>               bootopt |= RB_DUMP;
>>
>> Is it for avoiding to pass an empty mask to stop_cpus() in kdb_trap()
>> (I saw you changed the policy there)?
>
> Yes.  You know my position about elimination of the cpuset parameter to
> stop_cpus_* and my intention to do so.  This is related to that.  Right now 
> that
> check is not strictly necessary,  but it doesn't do any harm either.  We know
> that all other CPUs are already stopped when kdb_active (ditto for panicstr !=
> NULL).

I see there could be races between disabling interrupts and having 2
different stopping mechanisms that want to stop cpus, even when using
stop_cpus_hard(), on architectures that don't use a privileged channel
(NMI), as is the case for most of our !x86 ports.
These are very rare races (requiring kdb_trap() and panic() to
happen in parallel on different CPUs).

>> Maybe we can find a better integration among the two.
>
> What kind of integration?
> Right now I have simple model - both stop all other CPUs.

Well, there is no synchronization atm between panic stopping cpus and
kdb_trap(). When kdb_trap() stops cpus it has interrupts disabled,
and if panic has already started they will both deadlock because IPI_STOP
won't be properly delivered.
However, I see all this as a problem with other arches not having/not
implementing a real dedicated channel for cpu_stop_hard(), so we
should not think about it now.

I think we may need some sort of control, as panic already does with
panic_cpu, before disabling interrupts, but don't worry about it now.
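
One way to picture that kind of control, as a user-space sketch only:
a single compare-and-swap elects the one CPU that is allowed to stop
the others, in the same spirit as panic() guarding itself with
panic_cpu.  The names stopper_cpu and try_become_stopper() are
hypothetical, the NOCPU value is a stand-in, and C11 atomics take the
place of the kernel's atomic_cmpset_int():

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NOCPU 0xffu     /* stand-in value meaning "no CPU owns the stop" */

/* Hypothetical: the CPU that currently owns the right to stop the others. */
static _Atomic unsigned int stopper_cpu = NOCPU;

/*
 * Try to become the single CPU allowed to stop the others.
 * Returns true if cpuid won the election or already holds it.
 */
static bool
try_become_stopper(unsigned int cpuid)
{
    unsigned int expected = NOCPU;

    assert(cpuid != NOCPU);
    if (atomic_compare_exchange_strong(&stopper_cpu, &expected, cpuid))
        return (true);
    /* On failure, expected now holds the current owner. */
    return (expected == cpuid);
}
```

A loser of the election would have to wait (or be stopped) rather than
send its own stop request; that part is deliberately left out here.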

>> I'd also move the setting of stop_scheduler variable in the "if", it
>> seems a bug to me to have it set otherwise.
>
> Can you please explain what bug do you suspect here?
> I do not see any.

I just find it more natural to move it within the above if (panicstr ==
NULL ...) condition.

> [snip]
>> - I'm not sure I like to change the policies on cpu stopping for KDB
>> with this patchset.
>
> I am not sure what exactly you mean by change in policies.  I do not see any
> such change, entering kdb always stops all other CPUs, with or without the 
> patch.

Yes, I was confused because older code just stopped CPUs manually before
kdb_trap(); I think what kdb_trap() does now is ok (and you
just retain it).
I'd just change this check on panicstr:
@@ -606,9 +603,13 @@ kdb_trap(int type, int code, struct trapframe *tf)
        intr = intr_disable();

 #ifdef SMP
-       other_cpus = all_cpus;
-       CPU_CLR(PCPU_GET(cpuid), &other_cpus);
-       stop_cpus_hard(other_cpus);
+       if (panicstr == NULL) {
+               other_cpus = all_cpus;
+               CPU_CLR(PCPU_GET(cpuid), &other_cpus);
+               stop_cpus_hard(other_cpus);
+               did_stop_cpus = 1;
+       } else
+               did_stop_cpus = 0;

to be SCHEDULER_STOPPED().

If you agree I can fix the kern_mutex, kern_sx and kern_rwlock parts
and it should be done.
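
To make the suggested check concrete, here is a small user-space model
of it, not kernel code.  It assumes SCHEDULER_STOPPED() simply tests
the stop_scheduler variable mentioned earlier in the thread; the
function names ending in _model are invented stand-ins:

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: panic() sets this once it has stopped the other CPUs. */
static bool stop_scheduler = false;
#define SCHEDULER_STOPPED() (stop_scheduler)

/* Counts how many times a stop request would have been sent. */
static int stop_requests = 0;

/* Stand-in for stop_cpus_hard(other_cpus). */
static void
stop_cpus_hard_model(void)
{
    stop_requests++;
}

/*
 * Model of the SMP block in kdb_trap() with the proposed test.
 * Returns 1 if this entry stopped the other CPUs (and so would have
 * to restart them on exit), 0 if they were already stopped.
 */
static int
kdb_trap_stop_model(void)
{
    int did_stop_cpus;

    if (!SCHEDULER_STOPPED()) {
        stop_cpus_hard_model();
        did_stop_cpus = 1;
    } else
        did_stop_cpus = 0;
    assert(did_stop_cpus == 0 || did_stop_cpus == 1);
    return (did_stop_cpus);
}
```

Entering the debugger after a panic then sends no second stop request.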

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein


Re: datapoints on 10G throughput with TCP ?

2011-12-06 Thread Daniel O'Connor

On 07/12/2011, at 24:54, Daniel Kalchev wrote:
> It seems performance measurements are more dependent on the server (nuttcp 
> -S) machine.
> We will have to rule out the interrupt storms first of course, any advice?

You can control the storm threshold by setting the hw.intr_storm_threshold 
sysctl.

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C








Dog Food

2011-12-06 Thread Sean Bruno
Was trying to use gmirror(4) or zfs(4) today to get a machine in the
cluster set up with s/w raid and was completely flummoxed by the
intricacies of manual setup.  Chances are, I am just not smart enough to
wind my way through the various how-tos and wiki pages that I've been
browsing to get the job done.

If someone wants to work on modifying bsdinstaller to do s/w raid via
one of these mechanisms, clusteradm@ can provide you a two disk SATA
machine that can be used for this purpose.

Sean




Re: binutils-2.22: ld and --copy-dt-needed-entries

2011-12-06 Thread Andriy Gapon
on 06/12/2011 23:24 Martin Matuska said the following:
> On 6.12.2011 17:48, Andriy Gapon wrote:
>> Just for your information.
>> It seems that ld from binutils-2.22 by default has 
>> --no-copy-dt-needed-entries
>> behavior, and so explicit --copy-dt-needed-entries is now needed where the
>> previous default behavior is relied upon.
>>
>> A short excerpt from the man page for your convenience:
>>
>>> This option also has an effect on the resolution of symbols in
>>> dynamic libraries.  With --copy-dt-needed-entries dynamic libraries
>>> mentioned on the command line will be recursively searched,
>>> following their DT_NEEDED tags to other libraries, in order to
>>> resolve symbols required by the output binary.  With the default
>>> setting however the searching of dynamic libraries that follow it
>>> will stop with the dynamic library itself.  No DT_NEEDED links will
>>> be traversed to resolve symbols.
> What do we do with this?
> We can go back, patch to behave as before or to continue.
> Are there any serious complaints?

I am not sure.  Eventually all upstreams of our ports will have to deal with
this.  So far I've encountered only one problematic port (gegl) that links a
binary with -lglib-2.0 expecting that a required -liconv dependency would be
automatically picked up via DT_NEEDED.  libglib-2.0.so indeed has a DT_NEEDED
entry for libiconv.so.  But this dependency is not explicitly advertised via
pkg-config metadata:
$ fgrep -i Libs /usr/local/libdata/pkgconfig/glib-2.0.pc
Libs: -L${libdir} -lglib-2.0
Libs.private: -liconv

So there could be other issues related to this in the future.
Perhaps this is actually an issue with glib; maybe it should have -liconv in
Libs.  I am not really knowledgeable about this stuff.

-- 
Andriy Gapon


Re: Stop scheduler on panic

2011-12-06 Thread Andriy Gapon
on 06/12/2011 20:34 Attilio Rao said the following:
[snip]
> - I'm not entirely sure, why we want to disable interrupts at this
> moment (before to stop other CPUs)?:

Because I believe that stop_cpus_hard() should run in a context with interrupts
and preemption disabled.  Also, I believe that the whole panic handling  code
should run in the same context.  So it was only natural for me to enter that
context at this point.

> @@ -547,13 +555,18 @@ panic(const char *fmt, ...)
>  {
>  #ifdef SMP
>   static volatile u_int panic_cpu = NOCPU;
> + cpuset_t other_cpus;
>  #endif
>   struct thread *td = curthread;
>   int bootopt, newpanic;
>   va_list ap;
>   static char buf[256];
> 
> - critical_enter();
> + if (stop_scheduler_on_panic)
> + spinlock_enter();
> + else
> + critical_enter();
> +
> 
> - In this chunk I don't entirely understand the kdb_active check:
> @@ -566,11 +579,18 @@ panic(const char *fmt, ...)
>   PCPU_GET(cpuid)) == 0)
>   while (panic_cpu != NOCPU)
>   ; /* nothing */
> + if (stop_scheduler_on_panic) {
> + if (panicstr == NULL && !kdb_active) {
> + other_cpus = all_cpus;
> + CPU_CLR(PCPU_GET(cpuid), &other_cpus);
> + stop_cpus_hard(other_cpus);
> + }
> + }
>  #endif
> 
>   bootopt = RB_AUTOBOOT;
>   newpanic = 0;
> - if (panicstr)
> + if (panicstr != NULL)
>   bootopt |= RB_NOSYNC;
>   else {
>   bootopt |= RB_DUMP;
> 
> Is it for avoiding to pass an empty mask to stop_cpus() in kdb_trap()
> (I saw you changed the policy there)?

Yes.  You know my position about elimination of the cpuset parameter to
stop_cpus_* and my intention to do so.  This is related to that.  Right now that
check is not strictly necessary,  but it doesn't do any harm either.  We know
that all other CPUs are already stopped when kdb_active (ditto for panicstr !=
NULL).
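
The two pieces of that chunk can be modelled in plain C as follows.
This is only a sketch: cpumask_t is a hypothetical 64-bit stand-in for
cpuset_t, and both function names are invented for the example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t cpumask_t;     /* hypothetical stand-in for cpuset_t */

/* Same idea as "other_cpus = all_cpus; CPU_CLR(self, &other_cpus);". */
static cpumask_t
other_cpus_mask(cpumask_t all_cpus, unsigned int self)
{
    assert(self < 64);
    return (all_cpus & ~((cpumask_t)1 << self));
}

/*
 * panic() has to stop the other CPUs only if nobody has done so yet:
 * an earlier panic (panicstr != NULL) or an active debugger
 * (kdb_active) means they are already stopped.
 */
static bool
panic_needs_stop(const char *panicstr, bool kdb_active)
{
    return (panicstr == NULL && !kdb_active);
}
```

With four CPUs (mask 0xF) and the panic running on CPU 2, the mask of
CPUs to stop is 0xB.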

> Maybe we can find a better integration among the two.

What kind of integration?
Right now I have simple model - both stop all other CPUs.

> I'd also move the setting of stop_scheduler variable in the "if", it
> seems a bug to me to have it set otherwise.

Can you please explain what bug you suspect here?
I do not see any.

[snip]
> - I'm not sure I like to change the policies on cpu stopping for KDB
> with this patchset.

I am not sure what exactly you mean by change in policies.  I do not see any
such change, entering kdb always stops all other CPUs, with or without the 
patch.

> I think we should discuss this furthermore, in
> particular in terms of reviewing what accesses points KDB has and if
> they are all covered.

Please review the code and see if anything is left to discuss :-)
My review shows that all access points are covered.  Essentially there is only
one access point - kdb_trap.  It's called either directly from a known trap
context or indirectly (via a debug trap) using kdb_enter.

> If someone can summary this up (and has already made the analysis) and
> would please share his findings I'd be happy about it, otherwise we
> should not commit the stop_cpu approach in kdb_trap() IMHO.

I hope that the above answers clear up your concerns somewhat.  Just in case: if
you can point out any clear, specific problems with this approach, then that can
block me from committing.  A fuzzy feeling that something might be wrong won't
do that :-)

-- 
Andriy Gapon


Re: binutils-2.22: ld and --copy-dt-needed-entries

2011-12-06 Thread Martin Matuska
On 6.12.2011 17:48, Andriy Gapon wrote:
> Just for your information.
> It seems that ld from binutils-2.22 by default has --no-copy-dt-needed-entries
> behavior, and so explicit --copy-dt-needed-entries is now needed where the
> previous default behavior is relied upon.
>
> A short excerpt from the man page for your convenience:
>
>> This option also has an effect on the resolution of symbols in
>> dynamic libraries.  With --copy-dt-needed-entries dynamic libraries
>> mentioned on the command line will be recursively searched,
>> following their DT_NEEDED tags to other libraries, in order to
>> resolve symbols required by the output binary.  With the default
>> setting however the searching of dynamic libraries that follow it
>> will stop with the dynamic library itself.  No DT_NEEDED links will
>> be traversed to resolve symbols.
What do we do with this?
We can go back, patch to behave as before or to continue.
Are there any serious complaints?

-- 
Martin Matuska
FreeBSD committer
http://blog.vx.sk



Re: datapoints on 10G throughput with TCP ?

2011-12-06 Thread Luigi Rizzo
On Tue, Dec 06, 2011 at 07:40:21PM +0200, Daniel Kalchev wrote:
> I see significant difference between number of interrupts on the Intel and 
> the AMD blades. When performing a test between the Intel and AMD blades, the 
> Intel blade generates 20,000-35,000 interrupts, while the AMD blade generates 
> under 1,000 interrupts.
> 

Even in my experiments there is a lot of instability in the results.
I don't know exactly where the problem is, but the high number of
read syscalls, and the huge impact of setting interrupt_rate=0
(defaults at 16us on the ixgbe) makes me think that there is something
that needs investigation in the protocol stack.

Of course we don't want to optimize specifically for the one-flow-at-10G
case, but devising something that makes the system less affected
by short timing variations, and can pass upstream interrupt mitigation
delays would help.

I don't have a solution yet..

cheers
luigi


Re: virtualbox r0 memory management update [Was: Freeze with 10.0 and VirtualBox {4.1.4|4.1.6|4.1.51r38464}]

2011-12-06 Thread Andriy Gapon
on 06/12/2011 20:54 Bernhard Froehlich said the following:
> On 06.12.2011 12:14, Andriy Gapon wrote:
>> Meanwhile, here is the code that I came up with:
>> http://people.freebsd.org/~avg/vbox/
>> For your convenience I have uploaded both the new file and its diff
>> against svn
>> head.  I am testing this on FreeBSD head (r228017), so far no
>> breakage observed.
>>
>> I would appreciate reviews and testing of this code.  Especially testing with
>> earlier FreeBSD releases.
> 
> VirtualBox 4.1.6 on FreeBSD 9.0-RC2/amd64 gave me:
> 
> cc -O2 -pipe -DRT_OS_FREEBSD -DIN_RING0 -DIN_RT_R0 -DIN_SUP_R0 -DVBOX
> -DRT_WITH_VBOX -w -DVBOX_WITH_HARDENING -DVBOX_WITH_64_BITS_GUESTS
> -DRT_ARCH_AMD64 -fno-strict-aliasing -Werror -D_KERNEL -DKLD_MODULE -nostdinc 
> -Iinclude -I. -Ir0drv -I. -I@ -I@/contrib/altq -finline-limit=8000 --param
> inline-unit-growth=100 --param large-function-growth=1000 -fno-common 
> -fno-omit-frame-pointer  -mno-sse -mcmodel=kernel -mno-red-zone -mno-mmx
> -msoft-float  -fno-asynchronous-unwind-tables -ffreestanding -fstack-protector
> -std=iso9899:1999 -fstack-protector -Wall -Wredundant-decls -Wnested-externs
> -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline 
> -Wcast-qual 
> -Wundef -Wno-pointer-sign -fformat-extensions  -Wmissing-include-dirs
> -fdiagnostics-show-option -c
> /usr/home/decke/rpvbox/emulators/virtualbox-ose-kmod/work/VirtualBox-4.1.6_OSE/out/freebsd.amd64/release/bin/src/vboxdrv/r0drv/freebsd/memobj-r0drv-freebsd.c
> 
> /usr/home/decke/rpvbox/emulators/virtualbox-ose-kmod/work/VirtualBox-4.1.6_OSE/out/freebsd.amd64/release/bin/src/vboxdrv/r0drv/freebsd/memobj-r0drv-freebsd.c:
> In function 'FreeBSDContigPhysAllocHelper':
> /usr/home/decke/rpvbox/emulators/virtualbox-ose-kmod/work/VirtualBox-4.1.6_OSE/out/freebsd.amd64/release/bin/src/vboxdrv/r0drv/freebsd/memobj-r0drv-freebsd.c:212:
> error: incompatible types in initialization
> *** Error code 1
> 
> Stop in
> /usr/home/decke/rpvbox/emulators/virtualbox-ose-kmod/work/VirtualBox-4.1.6_OSE/out/freebsd.amd64/release/bin/src/vboxdrv.
> 
> *** Error code 1
> 

Could you please change that line as follows?
vm_page_t pPage = pPages + iPage;

-- 
Andriy Gapon


Re: Is there a FreeBSD 9+ version of this?

2011-12-06 Thread Warren Block

On Tue, 6 Dec 2011, Sean Bruno wrote:

> http://www.freebsd.org/doc/handbook/geom-mirror.html

Not in the Handbook.  To make gmirror work with GPT, create GPT 
partitions and mirror those.  I wrote an article on that using multiple 
partitions: http://www.wonkity.com/~wblock/docs/html/gmirror.html


However, it's since been pointed out that rebuilding the mirror could 
suffer from head contention.  Using a single GPT partition over the 
whole drive should not have that problem, but I haven't tested either 
way.



Re: Stop scheduler on panic

2011-12-06 Thread Attilio Rao
2011/11/13 Kostik Belousov :
> I was tricked into finishing the work by Andrey Gapon, who developed
> the patch to reliably stop other processors on panic.  The patch
> greatly improves the chances of getting dump on panic on SMP host.
> Several people already saw the patchset, and I remember that Andrey
> posted it to some lists.
>
> The change stops other (*) processors early upon the panic.  This way,
> no parallel manipulation of the kernel memory is performed by CPUs.
> In particular, the kernel memory map is static.  Patch prevents the
> panic thread from blocking and switching out.
>
> * - in the context of the description, other means not current.
>
> Since other threads are not run anymore, lock owner cannot release a
> lock which is required by panic thread.  Due to this, we need to fake
> a lock acquisition after the panic, which adds minimal overhead to the
> locking cost. The patch tries to not add any overhead on the fast path
> of the lock acquire.  The check for the after-panic condition was
> reduced to single memory access, done only when the quick cas lock
> attempt failed, and braced with __unlikely compiler hint.
>
> For now, the new mode of operation is disabled by default, since some
> further USB changes are needed to make USB keyboard usable in that
> environment.
>
> With the patch, getting a dump from the machine without debugger
> compiled in is much more realistic.  Please comment, I will commit the
> change in 2 weeks unless strong reasons not to are given.
>
> http://people.freebsd.org/~kib/misc/stop_cpus_on_panic.1.patch
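
A rough user-space illustration of the locking idea described above,
not the code from the patch: the fast path is a single compare-and-swap
with no extra test, and the after-panic flag is consulted only once
that attempt has failed.  The names model_lock, model_lock_acquire()
and scheduler_stopped are invented, and the GCC/Clang builtin
__builtin_expect stands in for the kernel's branch hint:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define LOCK_UNOWNED ((uintptr_t)0)
#define unlikely(x)  __builtin_expect(!!(x), 0)

struct model_lock {
    _Atomic uintptr_t owner;    /* LOCK_UNOWNED or the owner's thread id */
};

/* Assumption: set once panic() has stopped all other CPUs. */
static bool scheduler_stopped = false;

/* Returns true when the caller may proceed as if it held the lock. */
static bool
model_lock_acquire(struct model_lock *l, uintptr_t tid)
{
    uintptr_t expected = LOCK_UNOWNED;

    assert(tid != LOCK_UNOWNED);

    /* Fast path: one compare-and-swap, no panic check at all. */
    if (atomic_compare_exchange_strong(&l->owner, &expected, tid))
        return (true);

    /*
     * Slow path: after a panic the owner will never run again, so the
     * acquisition is faked instead of waiting forever.
     */
    if (unlikely(scheduler_stopped))
        return (true);

    return (false);     /* a real lock would spin or sleep here */
}
```

Note that the faked acquisition leaves the recorded owner untouched.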

Below are my notes on the patch:

- Just by looking at the kern_mutex.c chunk I see that you have added
SCHEDULER_STOPPED() checks in very specific places, usually after
sanity checks by WITNESS (and similar) and sometimes in odd places (such as
the chunk involving _mtx_lock_spin_flags). I think that we should just
skip all the checks along with the hard locking operation. Ideally we
should also skip the fast path, IMHO, but that is impossible (without
polluting it); we can at least skip the vast majority of operations for
the hard one, so that we get, for example:
%svn diff -x -p kern/kern_mutex.c | less
Index: kern/kern_mutex.c
===
--- kern/kern_mutex.c   (revision 228308)
+++ kern/kern_mutex.c   (working copy)
@@ -232,6 +232,8 @@ void
 _mtx_lock_spin_flags(struct mtx *m, int opts, const char *file, int line)
 {

+       if (SCHEDULER_STOPPED())
+               return;
        MPASS(curthread != NULL);
        KASSERT(m->mtx_lock != MTX_DESTROYED,
            ("mtx_lock_spin() of destroyed mutex @ %s:%d", file, line));

From this perspective I'd patch the hard functions directly rather than
waiting for them to hit the smallest possible common point (which are
_mtx_lock_sleep() and _mtx_lock_spin()). That will make the patch more
verbose but more precise and more correct too.
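
As a toy model of why the placement matters (user-space C, with
invented names and a plain flag standing in for SCHEDULER_STOPPED()):
when the check is the very first statement of the hard function, none
of the sanity checks run after a panic, even for a lock that would
otherwise trip them.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: a plain flag models the post-panic state. */
static bool stop_scheduler = false;
#define SCHEDULER_STOPPED() (stop_scheduler)

struct model_mtx {
    bool destroyed;
    bool held;
};

/* Counts how often the sanity-check section was entered. */
static int sanity_checks_run = 0;

/*
 * Model of a "hard" lock function with the check placed first.
 * Returns true if the lock was really taken, false if the whole
 * operation was skipped because the scheduler is stopped.
 */
static bool
model_lock_hard(struct model_mtx *m)
{
    if (SCHEDULER_STOPPED())
        return (false);

    sanity_checks_run++;
    assert(!m->destroyed);      /* stands in for the KASSERT/WITNESS checks */
    m->held = true;
    return (true);
}
```

With the check placed after the assertions instead, the call on a
destroyed lock would abort even though the system is already panicking.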

- This chunk is unneeded now:
@@ -577,6 +589,7 @@ retry:
m->mtx_recurse++;
break;
}
+
lock_profile_obtain_lock_failed(&m->lock_object,
&contested, &waittime);
/* Give interrupts a chance while we spin. */

- I'm not entirely sure why we want to disable interrupts at this
moment (before stopping other CPUs):
@@ -547,13 +555,18 @@ panic(const char *fmt, ...)
 {
 #ifdef SMP
        static volatile u_int panic_cpu = NOCPU;
+       cpuset_t other_cpus;
 #endif
        struct thread *td = curthread;
        int bootopt, newpanic;
        va_list ap;
        static char buf[256];

-       critical_enter();
+       if (stop_scheduler_on_panic)
+               spinlock_enter();
+       else
+               critical_enter();
+
+

- In this chunk I don't entirely understand the kdb_active check:
@@ -566,11 +579,18 @@ panic(const char *fmt, ...)
                    PCPU_GET(cpuid)) == 0)
                        while (panic_cpu != NOCPU)
                                ; /* nothing */
+       if (stop_scheduler_on_panic) {
+               if (panicstr == NULL && !kdb_active) {
+                       other_cpus = all_cpus;
+                       CPU_CLR(PCPU_GET(cpuid), &other_cpus);
+                       stop_cpus_hard(other_cpus);
+               }
+       }
 #endif

        bootopt = RB_AUTOBOOT;
        newpanic = 0;
-       if (panicstr)
+       if (panicstr != NULL)
                bootopt |= RB_NOSYNC;
        else {
                bootopt |= RB_DUMP;

Is it to avoid passing an empty mask to stop_cpus() in kdb_trap()
(I saw you changed the policy there)?
Maybe we can find a better integration between the two.
I'd also move the setting of the stop_scheduler variable into the "if"; it
seems a bug to me to have it set otherwise.

- The same reservations expressed about the hard path on mutex also
apply to rwlock and sxlock.

- I'm not sure I like to change the policies on cpu stopping

Re: virtualbox r0 memory management update [Was: Freeze with 10.0 and VirtualBox {4.1.4|4.1.6|4.1.51r38464}]

2011-12-06 Thread Bernhard Froehlich

On 06.12.2011 12:14, Andriy Gapon wrote:
> Meanwhile, here is the code that I came up with:
> http://people.freebsd.org/~avg/vbox/
> For your convenience I have uploaded both the new file and its diff
> against svn head.  I am testing this on FreeBSD head (r228017), so far no
> breakage observed.
>
> I would appreciate reviews and testing of this code.  Especially testing with
> earlier FreeBSD releases.


VirtualBox 4.1.6 on FreeBSD 9.0-RC2/amd64 gave me:

cc -O2 -pipe -DRT_OS_FREEBSD -DIN_RING0 -DIN_RT_R0 -DIN_SUP_R0 -DVBOX 
-DRT_WITH_VBOX -w -DVBOX_WITH_HARDENING -DVBOX_WITH_64_BITS_GUESTS 
-DRT_ARCH_AMD64 -fno-strict-aliasing -Werror -D_KERNEL -DKLD_MODULE 
-nostdinc  -Iinclude -I. -Ir0drv -I. -I@ -I@/contrib/altq 
-finline-limit=8000 --param inline-unit-growth=100 --param 
large-function-growth=1000 -fno-common  -fno-omit-frame-pointer  
-mno-sse -mcmodel=kernel -mno-red-zone -mno-mmx -msoft-float  
-fno-asynchronous-unwind-tables -ffreestanding -fstack-protector 
-std=iso9899:1999 -fstack-protector -Wall -Wredundant-decls 
-Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes 
-Wpointer-arith -Winline -Wcast-qual  -Wundef -Wno-pointer-sign 
-fformat-extensions  -Wmissing-include-dirs -fdiagnostics-show-option -c 
/usr/home/decke/rpvbox/emulators/virtualbox-ose-kmod/work/VirtualBox-4.1.6_OSE/out/freebsd.amd64/release/bin/src/vboxdrv/r0drv/freebsd/memobj-r0drv-freebsd.c
/usr/home/decke/rpvbox/emulators/virtualbox-ose-kmod/work/VirtualBox-4.1.6_OSE/out/freebsd.amd64/release/bin/src/vboxdrv/r0drv/freebsd/memobj-r0drv-freebsd.c: 
In function 'FreeBSDContigPhysAllocHelper':
/usr/home/decke/rpvbox/emulators/virtualbox-ose-kmod/work/VirtualBox-4.1.6_OSE/out/freebsd.amd64/release/bin/src/vboxdrv/r0drv/freebsd/memobj-r0drv-freebsd.c:212: 
error: incompatible types in initialization

*** Error code 1

Stop in 
/usr/home/decke/rpvbox/emulators/virtualbox-ose-kmod/work/VirtualBox-4.1.6_OSE/out/freebsd.amd64/release/bin/src/vboxdrv.

*** Error code 1

--
Bernhard Froehlich
http://www.bluelife.at/


Re: ZFS: i/o error all block copies unavailable Invalid format

2011-12-06 Thread Andriy Gapon
on 06/12/2011 15:41 KOT MATPOCKuH said the following:
> 2011/12/6 Peter Maloney :
>> And just out of curiosity, how did you find the old bootloader?
> I copied zfsloader from system, which has not been updated.
> Also I get zfsloader from weekly ZFS's snapshot.

Additionally, installworld preserves the previous loader (and zfsloader)
as a .old file.
Not sure about other means of updating /boot files.

-- 
Andriy Gapon


Re: ZFS: i/o error all block copies unavailable Invalid format

2011-12-06 Thread Andriy Gapon
on 06/12/2011 08:14 KOT MATPOCKuH said the following:
> Hello all!
> 
> On 24 nov I updated sources via csup to RELENG_9 (9.0-PRERELEASE).
> After make installboot I successfully booted to single user.
> But after make installworld system was fail to boot with message:
> ZFS: i/o error all block copies unavailable
> Invalid format
> 
> status command shows status of all polls properly.
> root filesystem is not compressed.
> 
> # zfsboottest /dev/gpt/rootdisk /dev/gpt/rootmirr
>   pool: sunway
> config:
> 
> NAME STATE
> sunway ONLINE
>   mirror ONLINE
> gpt/rootdisk ONLINE
> gpt/rootmirr ONLINE
> 
> Restore of old /boot/zfsloader was solved issue.
> Before I successfully updated 4 another systems with same sources
> level without any problems.
> 
> My sys/boot/zfs/zfsimpl.c's version: 1.17.2.2 2011/11/19 10:49:03
> 
> Where may a root cause of problem? And how I can debug this problem?

1. Inability of the boot code to handle your pool or root filesystem for some
reason.

2. Try using zfsboottest.sh the next time you reproduce the problem (or see what
the script does and try to examine some files instead of doing an empty run).
If it fails, then please try to narrow the problem down to a particular file,
run zfsboottest under a debugger, collect interesting information about the
failure and get back to us with it.

The problem may have been masked by the fact that you touched the /boot
directory when restoring your previous loader.  Alternatively, the problem
could be in the newer zfs boot code.

-- 
Andriy Gapon


Is there a FreeBSD 9+ version of this?

2011-12-06 Thread Sean Bruno
http://www.freebsd.org/doc/handbook/geom-mirror.html



Re: datapoints on 10G throughput with TCP ?

2011-12-06 Thread Daniel Kalchev
I see a significant difference in the number of interrupts between the Intel
and the AMD blades. When performing a test between the Intel and AMD blades, the
Intel blade generates 20,000-35,000 interrupts, while the AMD blade generates
under 1,000 interrupts.

There is no longer any throttling, but the performance does not improve.

I set it via

sysctl hw.intr_storm_threshold=0

Should this go into /boot/loader.conf instead?

Daniel

On Dec 6, 2011, at 7:21 PM, Jack Vogel wrote:

> Set the storm threshold to 0; that will disable it. It's going to throttle
> your performance when it happens.
> 
> Jack
> 



Re: datapoints on 10G throughput with TCP ?

2011-12-06 Thread Jack Vogel
Set the storm threshold to 0; that will disable it. It's going to throttle
your performance when it happens.

Jack


On Tue, Dec 6, 2011 at 6:24 AM, Daniel Kalchev  wrote:

> Some tests with updated FreeBSD to 8-stable as of today, compared with the
> previous run
>
>
>
> On 06.12.11 13:18, Daniel Kalchev wrote:
>
>>
>> FreeBSD 8.2-STABLE #0: Wed Sep 28 11:23:59 EEST 2011
>> CPU: Intel(R) Xeon(R) CPU   E5620  @ 2.40GHz (2403.58-MHz
>> K8-class CPU)
>> real memory  = 51539607552 (49152 MB)
>> blade 1:
>>
>> # nuttcp -S
>> # nuttcp -t -T 5 -w 128 -v localhost
>> nuttcp-t: v6.1.2: socket
>> nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> localhost
>> nuttcp-t: time limit = 5.00 seconds
>> nuttcp-t: connect to 127.0.0.1 with mss=14336, RTT=0.044 ms
>> nuttcp-t: send window size = 143360, receive window size = 71680
>> nuttcp-t: 8959.8750 MB in 5.02 real seconds = 1827635.67 KB/sec =
>> 14971.9914 Mbps
>> nuttcp-t: host-retrans = 0
>> nuttcp-t: 143358 I/O calls, msec/call = 0.04, calls/sec = 28556.81
>> nuttcp-t: 0.0user 4.9sys 0:05real 99% 106i+1428d 602maxrss 0+5pf 16+46csw
>>
>> nuttcp-r: v6.1.2: socket
>> nuttcp-r: buflen=65536, nstream=1, port=5001 tcp
>> nuttcp-r: accept from 127.0.0.1
>> nuttcp-r: send window size = 43008, receive window size = 143360
>> nuttcp-r: 8959.8750 MB in 5.17 real seconds = 1773171.07 KB/sec =
>> 14525.8174 Mbps
>> nuttcp-r: 219708 I/O calls, msec/call = 0.02, calls/sec = 42461.43
>> nuttcp-r: 0.0user 3.8sys 0:05real 76% 105i+1407d 614maxrss 1+17pf
>> 95059+22csw
>>
>
> New results:
>
> FreeBSD 8.2-STABLE #1: Tue Dec  6 13:51:01 EET 2011
>
>
>
> # nuttcp -t -T 5 -w 128 -v localhost
> nuttcp-t: v6.1.2: socket
> nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> localhost
> nuttcp-t: time limit = 5.00 seconds
> nuttcp-t: connect to 127.0.0.1 with mss=14336, RTT=0.030 ms
>
> nuttcp-t: send window size = 143360, receive window size = 71680
> nuttcp-t: 12748.0625 MB in 5.02 real seconds = 2599947.38 KB/sec =
> 21298.7689 Mbps
> nuttcp-t: host-retrans = 0
> nuttcp-t: 203969 I/O calls, msec/call = 0.03, calls/sec = 40624.18
> nuttcp-t: 0.0user 4.9sys 0:05real 99% 106i+1428d 620maxrss 0+2pf 1+82csw
>
>
> nuttcp-r: v6.1.2: socket
> nuttcp-r: buflen=65536, nstream=1, port=5001 tcp
> nuttcp-r: accept from 127.0.0.1
> nuttcp-r: send window size = 43008, receive window size = 143360
> nuttcp-r: 12748.0625 MB in 5.15 real seconds = 2536511.81 KB/sec =
> 20779.1048 Mbps
> nuttcp-r: 297000 I/O calls, msec/call = 0.02, calls/sec = 57709.75
> nuttcp-r: 0.1user 4.0sys 0:05real 81% 109i+1469d 626maxrss 0+15pf
> 121136+34csw
>
> Noticeable improvement.
>
>
>
>
>> blade 2:
>>
>> # nuttcp -t -T 5 -w 128 -v 10.2.101.12
>> nuttcp-t: v6.1.2: socket
>> nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> 10.2.101.12
>> nuttcp-t: time limit = 5.00 seconds
>> nuttcp-t: connect to 10.2.101.12 with mss=1448, RTT=0.059 ms
>> nuttcp-t: send window size = 131768, receive window size = 66608
>> nuttcp-t: 1340.6469 MB in 5.02 real seconds = 273449.90 KB/sec =
>> 2240.1016 Mbps
>> nuttcp-t: host-retrans = 171
>> nuttcp-t: 21451 I/O calls, msec/call = 0.24, calls/sec = 4272.78
>> nuttcp-t: 0.0user 1.9sys 0:05real 39% 120i+1610d 600maxrss 2+3pf
>> 75658+0csw
>>
>> nuttcp-r: v6.1.2: socket
>> nuttcp-r: buflen=65536, nstream=1, port=5001 tcp
>> nuttcp-r: accept from 10.2.101.11
>> nuttcp-r: send window size = 33304, receive window size = 131768
>> nuttcp-r: 1340.6469 MB in 5.17 real seconds = 265292.92 KB/sec =
>> 2173.2796 Mbps
>> nuttcp-r: 408764 I/O calls, msec/call = 0.01, calls/sec = 78992.15
>> nuttcp-r: 0.0user 3.3sys 0:05real 64% 105i+1413d 620maxrss 0+15pf
>> 105104+102csw
>>
>
> # nuttcp -t -T 5 -w 128 -v 10.2.101.11
> nuttcp-t: v6.1.2: socket
> nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> 10.2.101.11
>
> nuttcp-t: time limit = 5.00 seconds
> nuttcp-t: connect to 10.2.101.11 with mss=1448, RTT=0.055 ms
>
> nuttcp-t: send window size = 131768, receive window size = 66608
> nuttcp-t: 1964.8640 MB in 5.02 real seconds = 400757.59 KB/sec = 3283.0062
> Mbps
> nuttcp-t: host-retrans = 0
> nuttcp-t: 31438 I/O calls, msec/call = 0.16, calls/sec = 6261.87
> nuttcp-t: 0.0user 2.7sys 0:05real 55% 112i+1501d 1124maxrss 1+2pf
> 65+112csw
>
>
> nuttcp-r: v6.1.2: socket
> nuttcp-r: buflen=65536, nstream=1, port=5001 tcp
> nuttcp-r: accept from 10.2.101.12
>
> nuttcp-r: send window size = 33304, receive window size = 131768
> nuttcp-r: 1964.8640 MB in 5.15 real seconds = 390972.20 KB/sec = 3202.8442
> Mbps
> nuttcp-r: 560718 I/O calls, msec/call = 0.01, calls/sec = 108957.70
> nuttcp-r: 0.1user 4.2sys 0:05real 84% 111i+1494d 626maxrss 0+15pf
> 151930+16csw
>
> Again, improvement.
>
>
>
>>
>> Another pair of blades:
>>
>> FreeBSD 8.2-STABLE #0: Tue Aug  9 12:37:55 EEST 2011
>> CPU: AMD Opteron(tm) Processor 6134 (2300.04-MHz K8-class CPU)
>> real memory  = 68719476736 (65536 MB)
>>
>> first blade:
>>
>> # nuttcp -S
>> # nuttcp -t -T 5 -w 128 -v localhost
>> nuttcp-t: v6.1.2: socket
>> nuttcp-t: buflen=65

binutils-2.22: ld and --copy-dt-needed-entries

2011-12-06 Thread Andriy Gapon

Just for your information.
It seems that ld from binutils-2.22 by default has --no-copy-dt-needed-entries
behavior, and so explicit --copy-dt-needed-entries is now needed where the
previous default behavior is relied upon.

A short excerpt from the man page for your convenience:

> This option also has an effect on the resolution of symbols in
> dynamic libraries.  With --copy-dt-needed-entries dynamic libraries
> mentioned on the command line will be recursively searched,
> following their DT_NEEDED tags to other libraries, in order to
> resolve symbols required by the output binary.  With the default
> setting however the searching of dynamic libraries that follow it
> will stop with the dynamic library itself.  No DT_NEEDED links will
> be traversed to resolve symbols.

-- 
Andriy Gapon


Re: Stop scheduler on panic

2011-12-06 Thread Attilio Rao
2011/12/4 Andriy Gapon :
> on 02/12/2011 19:18 Attilio Rao said the following:
>> BTW, I'm waiting for the details to settle (including the patch we
>> have been discussing internally about binding to CPU0 during ACPI
>> shutdown)
>
> I do not see strong interdependency between that patch and the panic patch.
> BTW, I think that your patch is good to go.

I agree, we can get back to this once the stop_scheduler patch is in.

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein


Re: Stop scheduler on panic

2011-12-06 Thread Attilio Rao
2011/12/4 Andriy Gapon :
> on 21/11/2011 18:58 Attilio Rao said the following:
>> I would be very in favor about having a 'thread trampoline for KDB',
>> thus that it can use locks.
>
> I keep hearing the suggestion to add this trampoline, but I admit that I do
> not understand its technical meaning in this context.  And also how it helps
> with the locking.  So I will appreciate an explanation!  Thanks!

kdb_trap() now runs in interrupt context; my suggestion was just to
give KDB its own context (a new kernel thread) and yield to it
when KDB needs to be entered. This way it is possible to use locking
and avoid function duplication.

In theory, this avoids constructing complicated lockless algorithms
when implementing the primitives KDB should use.
However, I now realize there is a problem: if we want to stop CPUs we don't
really want to acquire locks anyway, because of CPU restart.
So the KDB trampoline is likely not a good option, and we
should work instead on getting KDB functions to be totally lockless.

Another thing I'm considering, however, is the entry point for KDB.
When I looked into it months ago I thought there was a mismatch between
kdb_enter() (which should disable CPUs) and other ways to enter KDB
(maybe some paths calling kdb_trap() directly?). We must offer a
unified policy and entry point, most likely one that disables CPUs on
entry.

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein


Re: Stop scheduler on panic

2011-12-06 Thread Attilio Rao
2011/12/2 Andriy Gapon :
> on 02/12/2011 20:40 John Baldwin said the following:
>> On 12/2/11 12:18 PM, Attilio Rao wrote:
>>> 2011/12/2 John Baldwin:
>>>> On 12/2/11 5:05 AM, Andriy Gapon wrote:
>>>>>
>>>>> on 02/12/2011 06:36 John Baldwin said the following:
>>>>>>
>>>>>> Ah, ok (I had thought SCHEDULER_STOPPED was going to always be true when
>>>>>> kdb was active).  But I think these two changes should cover
>>>>>> critical_exit() ok.
>>>>>>
>>>>>
>>>>> I attempted to start a discussion about this a few times already :-)
>>>>> Should we treat kdb context the same as SCHEDULER_STOPPED context (in the
>>>>> current definition) ?  That is, skip all locks in the same fashion?
>>>>> There are pros and contras.
>>>>
>>>>
>>>> kdb should not block on locks, no.  Most debugger commands should not go
>>>> near locks anyway unless they are intended to carefully modify the existing
>>>> system in a safe manner (such as the 'kill' command which should only be
>>>> using try locks and fail if it cannot safely post the signal).
>>>
>>> The biggest problem to KDB as the same as panic is that doing proper
>>> 'continue' is impossible.
>>> One of the features of the 'skip-locking' path is that it doesn't take
>>> into account fast locking paths, where sometimes the lock can succeed
>>> and other fails and you don't know about them. Also the restarted CPUs
>>> can find corrupted datas (as they can be arbitrarely updated), I'm
>>> sure it is too much panic prone.
>>
>> Yes, my thought is that kdb commands, etc. should be using dedicated routines
>> that do not use locks whenever possible.  The problem of a user calling an
>> arbitrary routine is not solvable (so I don't think we should try to solve
>> that, you use 'call' at your own risk), but built-in commands should
>> explicitly either 1) not use locking, or 2) only use try locks and fail out
>> cleanly (including dropping any try locks acquired) if a try fails.  Now,
>> that's an ideal view, I don't know how close we are to that in practice or if
>> it is a realistically attainable goal.
>>
>
>
> I agree with what Attilio and you say.  Initially it was tempting for me to
> apply the same SCHEDULER_STOPPED stopped medicine to the kdb_active context,
> but after trying to deal with kdb_active x SCHEDULER_STOPPED x ukbd situation
> I really changed my mind.
>
>
> I would classify the code that can be called in kdb_active context as follows:
> o debugger code proper (kdb, ddb, gdb stub, etc) - this obviously must not
> (doesn't have to) use any locking
>
> o code that can be invoked via 'call' command - this is essentially any code
> and I don't think that it can/should do anything special for the kdb_active
> context [*]
>
> o debugger helper routines - those that do something trivial should not
> acquire any locks; those that access shared resources should try the relevant
> locks and bail out if a resource can be in inconsistent state, or should be
> equipped to deal correctly with such a state; this is the same as what you say
> above
>
> o common code that the debuggers have to use - most obviously this is console
> code and drivers that serve a particular console; on one hand those drivers
> can have a non-trivial state that must be lock protected during normal
> operation, on the other hand the debugger must disregard those locks and grab
> its console; this is the most complex case in my opinion.

Thanks for summing this up.
However, please note that the code in entries 2 and 4 may have the same
issues, or be the same thing, in practice.

Anyway, I'm thinking now that if we really want to stop CPUs when
entering KDB (and I think we do), the functions in 2 and 4 should basically
be totally lockless, or in general totally re-entrant,
because when we restart the CPUs we don't really want them to find
corrupted data. Skipping locking is totally broken for this
very reason.

Functions in points 2 and 4 should be totally lockless, then, and
possibly work only in read mode. For point 2, specifically, I think we
need an explicit KPI to define functions within the subsystems
themselves (something like DB_SHOW_COMMAND()) which unambiguously marks
functions to be called within ddb (the only KDB backend we implement
right now), and for functions in point 4 we likely need to find a way
to make clear that they belong to the KDB area.
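
For reference, the try-lock rule quoted earlier in this message ("only use
try locks and fail out cleanly, including dropping any try locks acquired,
if a try fails") could look roughly like the userland sketch below. It
illustrates the quoted rule only, not the fully lockless approach argued for
here. Everything in it is an assumption made for illustration: struct
dbg_lock and the dbg_* functions are hypothetical names built on C11
atomic_flag, not FreeBSD kernel primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel lock; not a FreeBSD type. */
struct dbg_lock {
	atomic_flag held;
};

/* Try once to take the lock; never spins and never blocks. */
bool dbg_trylock(struct dbg_lock *l)
{
	/* test_and_set returns the previous value: false means we got it. */
	return !atomic_flag_test_and_set_explicit(&l->held,
	    memory_order_acquire);
}

void dbg_unlock(struct dbg_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Try every lock in order.  On the first failure, drop the locks already
 * taken (in reverse order) and report failure, so the caller can refuse
 * the command instead of waiting on a stopped CPU.
 */
bool dbg_trylock_all(struct dbg_lock **locks, size_t n)
{
	assert(locks != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!dbg_trylock(locks[i])) {
			while (i > 0)
				dbg_unlock(locks[--i]);
			return false;
		}
	}
	return true;
}

/* Release all locks taken by a successful dbg_trylock_all(). */
void dbg_unlock_all(struct dbg_lock **locks, size_t n)
{
	while (n > 0)
		dbg_unlock(locks[--n]);
}
```

A debugger command written this way either holds every lock it asked for or
holds none, so a failed attempt leaves no lock state behind for the
restarted CPUs to trip over.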

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein


Re: datapoints on 10G throughput with TCP ?

2011-12-06 Thread Daniel Kalchev
Some tests with FreeBSD updated to 8-stable as of today, compared with
the previous run:



On 06.12.11 13:18, Daniel Kalchev wrote:


FreeBSD 8.2-STABLE #0: Wed Sep 28 11:23:59 EEST 2011
CPU: Intel(R) Xeon(R) CPU   E5620  @ 2.40GHz (2403.58-MHz 
K8-class CPU)

real memory  = 51539607552 (49152 MB)
blade 1:

# nuttcp -S
# nuttcp -t -T 5 -w 128 -v localhost
nuttcp-t: v6.1.2: socket
nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> localhost
nuttcp-t: time limit = 5.00 seconds
nuttcp-t: connect to 127.0.0.1 with mss=14336, RTT=0.044 ms
nuttcp-t: send window size = 143360, receive window size = 71680
nuttcp-t: 8959.8750 MB in 5.02 real seconds = 1827635.67 KB/sec = 
14971.9914 Mbps

nuttcp-t: host-retrans = 0
nuttcp-t: 143358 I/O calls, msec/call = 0.04, calls/sec = 28556.81
nuttcp-t: 0.0user 4.9sys 0:05real 99% 106i+1428d 602maxrss 0+5pf 16+46csw

nuttcp-r: v6.1.2: socket
nuttcp-r: buflen=65536, nstream=1, port=5001 tcp
nuttcp-r: accept from 127.0.0.1
nuttcp-r: send window size = 43008, receive window size = 143360
nuttcp-r: 8959.8750 MB in 5.17 real seconds = 1773171.07 KB/sec = 
14525.8174 Mbps

nuttcp-r: 219708 I/O calls, msec/call = 0.02, calls/sec = 42461.43
nuttcp-r: 0.0user 3.8sys 0:05real 76% 105i+1407d 614maxrss 1+17pf 
95059+22csw


New results:

FreeBSD 8.2-STABLE #1: Tue Dec  6 13:51:01 EET 2011


# nuttcp -t -T 5 -w 128 -v localhost
nuttcp-t: v6.1.2: socket
nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> localhost
nuttcp-t: time limit = 5.00 seconds
nuttcp-t: connect to 127.0.0.1 with mss=14336, RTT=0.030 ms
nuttcp-t: send window size = 143360, receive window size = 71680
nuttcp-t: 12748.0625 MB in 5.02 real seconds = 2599947.38 KB/sec = 
21298.7689 Mbps

nuttcp-t: host-retrans = 0
nuttcp-t: 203969 I/O calls, msec/call = 0.03, calls/sec = 40624.18
nuttcp-t: 0.0user 4.9sys 0:05real 99% 106i+1428d 620maxrss 0+2pf 1+82csw

nuttcp-r: v6.1.2: socket
nuttcp-r: buflen=65536, nstream=1, port=5001 tcp
nuttcp-r: accept from 127.0.0.1
nuttcp-r: send window size = 43008, receive window size = 143360
nuttcp-r: 12748.0625 MB in 5.15 real seconds = 2536511.81 KB/sec = 
20779.1048 Mbps

nuttcp-r: 297000 I/O calls, msec/call = 0.02, calls/sec = 57709.75
nuttcp-r: 0.1user 4.0sys 0:05real 81% 109i+1469d 626maxrss 0+15pf 
121136+34csw


Noticeable improvement.
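
To put a number on the improvement, the two rates that nuttcp prints can be
related as in the sketch below. The unit assumptions (1 KB = 1024 bytes,
1 Mbps = 1,000,000 bits per second) are inferred from the figures in the
output above, not taken from the nuttcp documentation, and the function
names are made up for this example.

```c
#include <assert.h>

/*
 * Convert nuttcp's "KB/sec" figure to its "Mbps" figure.
 * Assumption inferred from the output above: 1 KB = 1024 bytes and
 * 1 Mbps = 1,000,000 bits per second.
 */
double kb_per_sec_to_mbps(double kb_per_sec)
{
	assert(kb_per_sec >= 0.0);
	return kb_per_sec * 1024.0 * 8.0 / 1e6;
}

/* Relative change from an old rate to a new rate, in percent. */
double percent_gain(double old_rate, double new_rate)
{
	assert(old_rate > 0.0);
	return (new_rate - old_rate) / old_rate * 100.0;
}
```

With the localhost sender figures above, kb_per_sec_to_mbps(1827635.67)
reproduces the reported 14971.99 Mbps, and
percent_gain(14971.9914, 21298.7689) comes to about 42%.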




blade 2:

# nuttcp -t -T 5 -w 128 -v 10.2.101.12
nuttcp-t: v6.1.2: socket
nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> 10.2.101.12
nuttcp-t: time limit = 5.00 seconds
nuttcp-t: connect to 10.2.101.12 with mss=1448, RTT=0.059 ms
nuttcp-t: send window size = 131768, receive window size = 66608
nuttcp-t: 1340.6469 MB in 5.02 real seconds = 273449.90 KB/sec = 
2240.1016 Mbps

nuttcp-t: host-retrans = 171
nuttcp-t: 21451 I/O calls, msec/call = 0.24, calls/sec = 4272.78
nuttcp-t: 0.0user 1.9sys 0:05real 39% 120i+1610d 600maxrss 2+3pf 
75658+0csw


nuttcp-r: v6.1.2: socket
nuttcp-r: buflen=65536, nstream=1, port=5001 tcp
nuttcp-r: accept from 10.2.101.11
nuttcp-r: send window size = 33304, receive window size = 131768
nuttcp-r: 1340.6469 MB in 5.17 real seconds = 265292.92 KB/sec = 
2173.2796 Mbps

nuttcp-r: 408764 I/O calls, msec/call = 0.01, calls/sec = 78992.15
nuttcp-r: 0.0user 3.3sys 0:05real 64% 105i+1413d 620maxrss 0+15pf 
105104+102csw


# nuttcp -t -T 5 -w 128 -v 10.2.101.11
nuttcp-t: v6.1.2: socket
nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> 10.2.101.11
nuttcp-t: time limit = 5.00 seconds
nuttcp-t: connect to 10.2.101.11 with mss=1448, RTT=0.055 ms
nuttcp-t: send window size = 131768, receive window size = 66608
nuttcp-t: 1964.8640 MB in 5.02 real seconds = 400757.59 KB/sec = 
3283.0062 Mbps

nuttcp-t: host-retrans = 0
nuttcp-t: 31438 I/O calls, msec/call = 0.16, calls/sec = 6261.87
nuttcp-t: 0.0user 2.7sys 0:05real 55% 112i+1501d 1124maxrss 1+2pf 
65+112csw


nuttcp-r: v6.1.2: socket
nuttcp-r: buflen=65536, nstream=1, port=5001 tcp
nuttcp-r: accept from 10.2.101.12
nuttcp-r: send window size = 33304, receive window size = 131768
nuttcp-r: 1964.8640 MB in 5.15 real seconds = 390972.20 KB/sec = 
3202.8442 Mbps

nuttcp-r: 560718 I/O calls, msec/call = 0.01, calls/sec = 108957.70
nuttcp-r: 0.1user 4.2sys 0:05real 84% 111i+1494d 626maxrss 0+15pf 
151930+16csw


Again, improvement.





Another pair of blades:

FreeBSD 8.2-STABLE #0: Tue Aug  9 12:37:55 EEST 2011
CPU: AMD Opteron(tm) Processor 6134 (2300.04-MHz K8-class CPU)
real memory  = 68719476736 (65536 MB)

first blade:

# nuttcp -S
# nuttcp -t -T 5 -w 128 -v localhost
nuttcp-t: v6.1.2: socket
nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> localhost
nuttcp-t: time limit = 5.00 seconds
nuttcp-t: connect to 127.0.0.1 with mss=14336, RTT=0.090 ms
nuttcp-t: send window size = 143360, receive window size = 71680
nuttcp-t: 2695.0625 MB in 5.00 real seconds = 551756.90 KB/sec = 
4519.9925 Mbps

nuttcp-t: host-retrans = 0
nuttcp-t: 43121 I/O calls, msec/call = 0.12, calls/sec = 8621.20
nuttcp-t: 0.0user 4.9sys 0:05real 99% 106i+1428d 620maxrss 0+4pf 2+71csw

nuttcp-r: v6.

Re: ZFS: i/o error all block copies unavailable Invalid format

2011-12-06 Thread KOT MATPOCKuH
2011/12/6 Peter Maloney :
>>> "Invalid format" sounds like the software doesn't understand the disks.
>>> Check your pool (software) version with:
>>> # zpool upgrade -v
>> zpool upgrade -v does not show pools, available for upgrade :)
>>
>> # zpool upgrade
>> This system is currently running ZFS pool version 28.
>>
>> All pools are formatted using this version.
>>
>>> Check your pool (on disk) version with (I forget the exact command):
>>> # zpool get version sunway
>> NAME    PROPERTY  VALUE    SOURCE
>> sunway  version   28       default
>>
>> It's latest pool's version for RELENG_9.
>>
>>> My guess is that you installed the latest zfs on the pool, but left the
>>> old version of the bootloader.
>> You mean gptzfsboot ?
> Yes.
>> Old gptzfsboot must fail with message like this:
>> ZFS: unsupported ZFS version %u (should be %u)
>>
>> And why problem solved by copying previous zfsloader?
>> Without any another changes...
>>
> previous zfsloader? Oh how interesting. I missed that in your last message.
>
> When you updated the other 4 systems "with same sources" did you mean
> the same cvsup file, or the exact copy of the source?
I used the same cvsup file from the same cvsup mirror at the same time...
sys/boot/zfs/zfsimpl.c has the same version.
The only difference for this system is that it uses SAS drives; the other
systems have IDE and SATA.


> And just out of curiosity, how did you find the old bootloader?
I copied zfsloader from a system which has not been updated.
I also got zfsloader from a weekly ZFS snapshot.

> Did you also try copying the bootloader (with dd maybe) from one of the
> working updated systems? Or comparing checksums of the bootloaders?
All checksums are different...
If possible, I will try to boot the system with all available zfsloaders:
- old from this system
- again new from this system
- old from another system
- new from another system

-- 
MATPOCKuH


Re: Freeze with 10.0 and VirtualBox {4.1.4|4.1.6|4.1.51r38464}

2011-12-06 Thread Andriy Gapon
on 06/12/2011 14:51 Bernhard Froehlich said the following:
> 
> No. We currently have the last 2 virtualbox major versions in the ports tree
> so if we decide to do such an incompatible change we always have the older
> version around for 8.2 users for about the next year.

Oh, I missed the fact that we have port versions for virtualbox ports.
Sorry for the noise, then.
Anyway, in the patch I tried to cover different FreeBSD versions.
One major discrepancy between stable/8 and later versions is the changes in
page locking that have never been MFC-ed.

-- 
Andriy Gapon


Re: Freeze with 10.0 and VirtualBox {4.1.4|4.1.6|4.1.51r38464}

2011-12-06 Thread Bernhard Froehlich

On 06.12.2011 12:00, Andriy Gapon wrote:
> on 06/12/2011 11:40 Bernhard Froehlich said the following:
>> On 06.12.2011 10:08, Andriy Gapon wrote:
>>> on 06/12/2011 02:08 Alan Cox said the following:
>>>> The right thing to do is to MFC vm_page_alloc_contig().  It shouldn't
>>>> be that hard to merge it.
>>>
>>> Ah, but we want to be able to run virtualbox on the past releases too.
>>
>> Which releases are we talking about here? VirtualBox was always
>> deprecating older FreeBSD releases much faster than our Security EOL so
>> if we focus on latest 8.x and >= 9.0 I think it should be okay.
>>
>> We have already deprecated 7.x because of some other problems with the
>> userland bits so we can forget about 7.x completely. What is left is 8.1
>> and 8.2 and both are EOL until mid 2012. So if is possible to get it
>> merged in time for 8.3 and 9.1 (if needed) we should be fine.
>
> So we would just say "screw you" to people who stick to e.g. 8.2 at
> the moment? :)

No. We currently have the last 2 virtualbox major versions in the ports
tree so if we decide to do such an incompatible change we always have the
older version around for 8.2 users for about the next year.

--
Bernhard Froehlich
http://www.bluelife.at/


Re: ZFS: i/o error all block copies unavailable Invalid format

2011-12-06 Thread Peter Maloney
On 12/06/2011 12:06 PM, KOT MATPOCKuH wrote:
> Hello!
>
> 2011/12/6 Peter Maloney :
>
>> "Invalid format" sounds like the software doesn't understand the disks.
>> Check your pool (software) version with:
>> # zpool upgrade -v
> zpool upgrade -v does not show pools, available for upgrade :)
>
> # zpool upgrade
> This system is currently running ZFS pool version 28.
>
> All pools are formatted using this version.
>
>> Check your pool (on disk) version with (I forget the exact command):
>> # zpool get version sunway
> NAME    PROPERTY  VALUE    SOURCE
> sunway  version   28       default
>
> It's latest pool's version for RELENG_9.
>
>> My guess is that you installed the latest zfs on the pool, but left the
>> old version of the bootloader.
> You mean gptzfsboot ?
Yes.
> Old gptzfsboot must fail with message like this:
> ZFS: unsupported ZFS version %u (should be %u)
>
> And why problem solved by copying previous zfsloader?
> Without any another changes...
>
previous zfsloader? Oh how interesting. I missed that in your last message.

When you updated the other 4 systems "with same sources" did you mean
the same cvsup file, or the exact copy of the source? I often see people
posting about some mirrors updating later than others [I forget if this
applies to current or stable or both], so I wouldn't trust them to have
the same download each time, or a consistent download given a date
unless it is a distant date.

And just out of curiosity, how did you find the old bootloader? I
probably wouldn't think to back it up if the new one compiled without error.

Did you also try copying the bootloader (with dd maybe) from one of the
working updated systems? Or comparing checksums of the bootloaders?

-- 


Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.malo...@brockmann-consult.de
Internet: http://www.brockmann-consult.de




Re: datapoints on 10G throughput with TCP ?

2011-12-06 Thread Daniel Kalchev



On 06.12.11 13:18, Daniel Kalchev wrote:

[...]
second blade:

# nuttcp -t -T 5 -w 128 -v 10.2.101.13
nuttcp-t: v6.1.2: socket
nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> 10.2.101.13
nuttcp-t: time limit = 5.00 seconds
nuttcp-t: connect to 10.2.101.13 with mss=1448, RTT=0.164 ms
nuttcp-t: send window size = 131768, receive window size = 66608
nuttcp-t: 1290.3750 MB in 5.00 real seconds = 264173.96 KB/sec = 
2164.1131 Mbps

nuttcp-t: host-retrans = 0
nuttcp-t: 20646 I/O calls, msec/call = 0.25, calls/sec = 4127.72
nuttcp-t: 0.0user 3.8sys 0:05real 77% 96i+1299d 616maxrss 0+3pf 
27389+0csw


nuttcp-r: v6.1.2: socket
nuttcp-r: buflen=65536, nstream=1, port=5001 tcp
nuttcp-r: accept from 10.2.101.14
nuttcp-r: send window size = 33304, receive window size = 131768
nuttcp-r: 1290.3750 MB in 5.14 real seconds = 256835.92 KB/sec = 
2103.9998 Mbps

nuttcp-r: 85668 I/O calls, msec/call = 0.06, calls/sec = 16651.70
nuttcp-r: 0.0user 4.8sys 0:05real 94% 107i+1437d 624maxrss 0+15pf 
11848+0csw



Not impressive... I am rebuilding now to -stable.

Daniel


I also noticed interrupt storms happening while this was running on the 
second pair of blades:


interrupt storm detected on "irq272:"; throttling interrupt source
interrupt storm detected on "irq272:"; throttling interrupt source
interrupt storm detected on "irq272:"; throttling interrupt source
interrupt storm detected on "irq270:"; throttling interrupt source
interrupt storm detected on "irq270:"; throttling interrupt source
interrupt storm detected on "irq270:"; throttling interrupt source

some stats

# sysctl -a dev.ix.1
dev.ix.1.%desc: Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 2.3.10
dev.ix.1.%driver: ix
dev.ix.1.%location: slot=0 function=1
dev.ix.1.%pnpinfo: vendor=0x8086 device=0x10fc subvendor=0x subdevice=0x class=0x02
dev.ix.1.%parent: pci3
dev.ix.1.flow_control: 3
dev.ix.1.advertise_gig: 0
dev.ix.1.enable_aim: 1
dev.ix.1.rx_processing_limit: 128
dev.ix.1.dropped: 0
dev.ix.1.mbuf_defrag_failed: 0
dev.ix.1.no_tx_dma_setup: 0
dev.ix.1.watchdog_events: 0
dev.ix.1.tso_tx: 1193460
dev.ix.1.link_irq: 1
dev.ix.1.queue0.interrupt_rate: 100
dev.ix.1.queue0.txd_head: 45
dev.ix.1.queue0.txd_tail: 45
dev.ix.1.queue0.no_desc_avail: 0
dev.ix.1.queue0.tx_packets: 23
dev.ix.1.queue0.rxd_head: 16
dev.ix.1.queue0.rxd_tail: 15
dev.ix.1.queue0.rx_packets: 16
dev.ix.1.queue0.rx_bytes: 2029
dev.ix.1.queue0.lro_queued: 0
dev.ix.1.queue0.lro_flushed: 0
dev.ix.1.queue1.interrupt_rate: 62500
dev.ix.1.queue1.txd_head: 0
dev.ix.1.queue1.txd_tail: 0
dev.ix.1.queue1.no_desc_avail: 0
dev.ix.1.queue1.tx_packets: 0
dev.ix.1.queue1.rxd_head: 0
dev.ix.1.queue1.rxd_tail: 2047
dev.ix.1.queue1.rx_packets: 0
dev.ix.1.queue1.rx_bytes: 0
dev.ix.1.queue1.lro_queued: 0
dev.ix.1.queue1.lro_flushed: 0
dev.ix.1.queue2.interrupt_rate: 20
dev.ix.1.queue2.txd_head: 545
dev.ix.1.queue2.txd_tail: 545
dev.ix.1.queue2.no_desc_avail: 0
dev.ix.1.queue2.tx_packets: 331690
dev.ix.1.queue2.rxd_head: 1099
dev.ix.1.queue2.rxd_tail: 1098
dev.ix.1.queue2.rx_packets: 498763
dev.ix.1.queue2.rx_bytes: 32954702
dev.ix.1.queue2.lro_queued: 0
dev.ix.1.queue2.lro_flushed: 0
dev.ix.1.queue3.interrupt_rate: 62500
dev.ix.1.queue3.txd_head: 0
dev.ix.1.queue3.txd_tail: 0
dev.ix.1.queue3.no_desc_avail: 0
dev.ix.1.queue3.tx_packets: 0
dev.ix.1.queue3.rxd_head: 0
dev.ix.1.queue3.rxd_tail: 2047
dev.ix.1.queue3.rx_packets: 0
dev.ix.1.queue3.rx_bytes: 0
dev.ix.1.queue3.lro_queued: 0
dev.ix.1.queue3.lro_flushed: 0
dev.ix.1.queue4.interrupt_rate: 100
dev.ix.1.queue4.txd_head: 13
dev.ix.1.queue4.txd_tail: 13
dev.ix.1.queue4.no_desc_avail: 0
dev.ix.1.queue4.tx_packets: 6
dev.ix.1.queue4.rxd_head: 6
dev.ix.1.queue4.rxd_tail: 5
dev.ix.1.queue4.rx_packets: 6
dev.ix.1.queue4.rx_bytes: 899
dev.ix.1.queue4.lro_queued: 0
dev.ix.1.queue4.lro_flushed: 0
dev.ix.1.queue5.interrupt_rate: 20
dev.ix.1.queue5.txd_head: 982
dev.ix.1.queue5.txd_tail: 982
dev.ix.1.queue5.no_desc_avail: 0
dev.ix.1.queue5.tx_packets: 302592
dev.ix.1.queue5.rxd_head: 956
dev.ix.1.queue5.rxd_tail: 955
dev.ix.1.queue5.rx_packets: 474044
dev.ix.1.queue5.rx_bytes: 31319840
dev.ix.1.queue5.lro_queued: 0
dev.ix.1.queue5.lro_flushed: 0
dev.ix.1.queue6.interrupt_rate: 20
dev.ix.1.queue6.txd_head: 1902
dev.ix.1.queue6.txd_tail: 1902
dev.ix.1.queue6.no_desc_avail: 0
dev.ix.1.queue6.tx_packets: 184922
dev.ix.1.queue6.rxd_head: 1410
dev.ix.1.queue6.rxd_tail: 1409
dev.ix.1.queue6.rx_packets: 402818
dev.ix.1.queue6.rx_bytes: 27759640
dev.ix.1.queue6.lro_queued: 0
dev.ix.1.queue6.lro_flushed: 0
dev.ix.1.queue7.interrupt_rate: 20
dev.ix.1.queue7.txd_head: 660
dev.ix.1.queue7.txd_tail: 660
dev.ix.1.queue7.no_desc_avail: 0
dev.ix.1.queue7.tx_packets: 378078
dev.ix.1.queue7.rxd_head: 885
dev.ix.1.queue7.rxd_tail: 884
dev.ix.1.queue7.rx_packets: 705397
dev.ix.1.queue7.rx_bytes: 46572290
dev.ix.1.queue7.lro_queued: 0
dev.ix.1.queue7.lro_flushed: 0
dev.ix.1.mac_stats.crc_errs: 0
dev.ix.1.mac_stats.ill_errs: 0
dev.ix.1.mac_stats.byt
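
To see how the receive load is spread over the eight queues, the per-queue
counters can be pulled out of the sysctl text with a small script. The
sketch below is illustrative only: the function names are hypothetical, the
sample holds just the rx_packets lines copied from the output above, and
the counters are presumably cumulative rather than specific to one run.

```python
import re
from collections import defaultdict

# A few lines copied from the sysctl output above (rx_packets only).
SAMPLE = """\
dev.ix.1.queue0.rx_packets: 16
dev.ix.1.queue1.rx_packets: 0
dev.ix.1.queue2.rx_packets: 498763
dev.ix.1.queue3.rx_packets: 0
dev.ix.1.queue4.rx_packets: 6
dev.ix.1.queue5.rx_packets: 474044
dev.ix.1.queue6.rx_packets: 402818
dev.ix.1.queue7.rx_packets: 705397
"""

# Matches "dev.ix.<unit>.queue<N>.<counter>: <integer>".
LINE_RE = re.compile(r"^dev\.ix\.\d+\.queue(\d+)\.(\w+):\s+(\d+)$")


def parse_queue_stats(text):
    """Return {queue_number: {counter_name: value}} for per-queue lines."""
    stats = defaultdict(dict)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            queue, name, value = match.groups()
            stats[int(queue)][name] = int(value)
    return dict(stats)


def rx_share(stats):
    """Return each queue's fraction of the total received packets."""
    total = sum(q.get("rx_packets", 0) for q in stats.values())
    return {
        queue: (counters.get("rx_packets", 0) / total if total else 0.0)
        for queue, counters in sorted(stats.items())
    }


stats = parse_queue_stats(SAMPLE)
for queue, share in rx_share(stats).items():
    print(f"queue{queue}: {share:6.1%}")
```

With these counters almost all received packets land on queues 2, 5, 6 and
7, while queues 1 and 3 show no traffic at all.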

Re: datapoints on 10G throughput with TCP ?

2011-12-06 Thread Daniel Kalchev
Here is what I get, with an existing install, no tuning other than 
kern.ipc.nmbclusters=512000


Pair of Supermicro blades:

FreeBSD 8.2-STABLE #0: Wed Sep 28 11:23:59 EEST 2011
CPU: Intel(R) Xeon(R) CPU   E5620  @ 2.40GHz (2403.58-MHz K8-class CPU)
real memory  = 51539607552 (49152 MB)
[...]
ix0: port 0xdc00-0xdc1f mem 0xfbc0-0xfbdf,0xfbbfc000-0xfbbf irq 16 at device 0.0 on pci3
ix0: Using MSIX interrupts with 9 vectors
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: Ethernet address: xx:xx:xx:xx:xx:xx
ix0: PCI Express Bus: Speed 5.0Gb/s Width x8
ix1: port 0xd880-0xd89f mem 0xfb80-0xfb9f,0xfbbf8000-0xfbbfbfff irq 17 at device 0.1 on pci3
ix1: Using MSIX interrupts with 9 vectors
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: Ethernet address: xx:xx:xx:xx:xx:xx
ix1: PCI Express Bus: Speed 5.0Gb/s Width x8


blade 1:

# nuttcp -S
# nuttcp -t -T 5 -w 128 -v localhost
nuttcp-t: v6.1.2: socket
nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> localhost
nuttcp-t: time limit = 5.00 seconds
nuttcp-t: connect to 127.0.0.1 with mss=14336, RTT=0.044 ms
nuttcp-t: send window size = 143360, receive window size = 71680
nuttcp-t: 8959.8750 MB in 5.02 real seconds = 1827635.67 KB/sec = 14971.9914 Mbps
nuttcp-t: host-retrans = 0
nuttcp-t: 143358 I/O calls, msec/call = 0.04, calls/sec = 28556.81
nuttcp-t: 0.0user 4.9sys 0:05real 99% 106i+1428d 602maxrss 0+5pf 16+46csw

nuttcp-r: v6.1.2: socket
nuttcp-r: buflen=65536, nstream=1, port=5001 tcp
nuttcp-r: accept from 127.0.0.1
nuttcp-r: send window size = 43008, receive window size = 143360
nuttcp-r: 8959.8750 MB in 5.17 real seconds = 1773171.07 KB/sec = 14525.8174 Mbps
nuttcp-r: 219708 I/O calls, msec/call = 0.02, calls/sec = 42461.43
nuttcp-r: 0.0user 3.8sys 0:05real 76% 105i+1407d 614maxrss 1+17pf 95059+22csw


blade 2:

# nuttcp -t -T 5 -w 128 -v 10.2.101.12
nuttcp-t: v6.1.2: socket
nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> 10.2.101.12
nuttcp-t: time limit = 5.00 seconds
nuttcp-t: connect to 10.2.101.12 with mss=1448, RTT=0.059 ms
nuttcp-t: send window size = 131768, receive window size = 66608
nuttcp-t: 1340.6469 MB in 5.02 real seconds = 273449.90 KB/sec = 2240.1016 Mbps
nuttcp-t: host-retrans = 171
nuttcp-t: 21451 I/O calls, msec/call = 0.24, calls/sec = 4272.78
nuttcp-t: 0.0user 1.9sys 0:05real 39% 120i+1610d 600maxrss 2+3pf 75658+0csw

nuttcp-r: v6.1.2: socket
nuttcp-r: buflen=65536, nstream=1, port=5001 tcp
nuttcp-r: accept from 10.2.101.11
nuttcp-r: send window size = 33304, receive window size = 131768
nuttcp-r: 1340.6469 MB in 5.17 real seconds = 265292.92 KB/sec = 2173.2796 Mbps
nuttcp-r: 408764 I/O calls, msec/call = 0.01, calls/sec = 78992.15
nuttcp-r: 0.0user 3.3sys 0:05real 64% 105i+1413d 620maxrss 0+15pf 105104+102csw
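
As a rough sanity check on the wire result, a single TCP stream cannot move
more than one window per round trip, so window / RTT gives an upper bound.
The sketch below uses the 131768-byte window and the 0.059 ms RTT that
nuttcp reported above. It assumes the connect-time RTT is representative,
which it may not be under load, and the function name is made up for
illustration.

```python
# Sketch: window-limited upper bound for a single TCP stream.
# Assumption: the RTT reported at connect time still holds during the transfer.

def window_limit_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound in Mbps (10**6 bits/s): one full window per round trip."""
    rtt_seconds = rtt_ms / 1000.0
    return window_bytes * 8 / rtt_seconds / 1e6


# Window size and RTT from the blade 2 run above.
bound = window_limit_mbps(131768, 0.059)
measured = 2240.1016  # Mbps, transmitter summary line above

print(f"window-limited bound: {bound:.0f} Mbps")
print(f"measured rate is {measured / bound:.1%} of that bound")
```

Under that assumption the 128 KB window allows well over 10 Gbit/s, which
suggests (but does not prove) that the window size is not what caps this
run at about 2.2 Gbit/s.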



Another pair of blades:

FreeBSD 8.2-STABLE #0: Tue Aug  9 12:37:55 EEST 2011
CPU: AMD Opteron(tm) Processor 6134 (2300.04-MHz K8-class CPU)
real memory  = 68719476736 (65536 MB)
[...]
ix0: port 0xe400-0xe41f mem 0xfe60-0xfe7f,0xfe4fc000-0xfe4f irq 19 at device 0.0 on pci3
ix0: Using MSIX interrupts with 9 vectors
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: Ethernet address: xx:xx:xx:xx:xx:xx
ix0: PCI Express Bus: Speed 5.0Gb/s Width x8
ix1: port 0xe800-0xe81f mem 0xfea0-0xfebf,0xfe8fc000-0xfe8f irq 16 at device 0.1 on pci3
ix1: Using MSIX interrupts with 9 vectors
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: Ethernet address: xx:xx:xx:xx:xx:xx
ix1: PCI Express Bus: Speed 5.0Gb/s Width x8

first blade:

# nuttcp -S
# nuttcp -t -T 5 -w 128 -v localhost
nuttcp-t: v6.1.2: socket
nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> localhost
nuttcp-t: time limit = 5.00 seconds
nuttcp-t: connect to 127.0.0.1 with mss=14336, RTT=0.090 ms
nuttcp-t: send window size = 143360, receive window size = 71680
nuttcp-t: 2695.0625 MB in 5.00 real seconds = 551756.90 KB/sec = 4519.9925 Mbps
nuttcp-t: host-retrans = 0
nuttcp-t: 43121 I/O calls, msec/call = 0.12, calls/sec = 8621.20
nuttcp-t: 0.0user 4.9sys 0:05real 99% 106i+1428d 620maxrss 0+4pf 2+71csw

nuttcp-r: v6.1.2: socket
nuttcp-r: buflen=65536, nstream=1, port=5001 tcp
nuttcp-r: accept from 127.0.0.1
nuttcp-r: send window size = 43008, receive window size = 143360
nuttcp-r: 2695.0625 MB in 5.14 real seconds = 536509.66 KB/sec = 4395.0871 Mbps
nuttcp-r: 43126 I/O calls, msec/call = 0.12, calls/sec = 8383.94
nuttcp-r: 0.0user 3.1sys 0:05real 61% 94i+1264d 624maxrss 1+16pf 43019+0csw

second blade:

# nuttcp -t -T 5 -w 128 -v 10.2.101.13
nuttcp-t: v6.1.2: socket
nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> 10.2.101.13
nuttcp-t: time 

virtualbox r0 memory management update [Was: Freeze with 10.0 and VirtualBox {4.1.4|4.1.6|4.1.51r38464}]

2011-12-06 Thread Andriy Gapon

Meanwhile, here is the code that I came up with:
http://people.freebsd.org/~avg/vbox/
For your convenience I have uploaded both the new file and its diff against svn
head.  I am testing this on FreeBSD head (r228017), so far no breakage observed.

I would appreciate reviews and testing of this code.  Especially testing with
earlier FreeBSD releases.

Thank you!
-- 
Andriy Gapon


Re: ZFS: i/o error all block copies unavailable Invalid format

2011-12-06 Thread KOT MATPOCKuH
Hello!

2011/12/6 Peter Maloney:

> "Invalid format" sounds like the software doesn't understand the disks.
> Check your pool (software) version with:
> # zpool upgrade -v
zpool upgrade -v does not show pools available for upgrade :)

# zpool upgrade
This system is currently running ZFS pool version 28.

All pools are formatted using this version.

> Check your pool (on disk) version with (I forget the exact command):
> # zpool get version sunway
NAME    PROPERTY  VALUE    SOURCE
sunway  version   28       default

It's the latest pool version for RELENG_9.

> My guess is that you installed the latest zfs on the pool, but left the
> old version of the bootloader.
You mean gptzfsboot ?

An old gptzfsboot should fail with a message like this:
ZFS: unsupported ZFS version %u (should be %u)

And why was the problem solved by copying the previous zfsloader,
without any other changes?

-- 
MATPOCKuH


Re: Freeze with 10.0 and VirtualBox {4.1.4|4.1.6|4.1.51r38464}

2011-12-06 Thread Andriy Gapon
on 06/12/2011 11:40 Bernhard Froehlich said the following:
> On 06.12.2011 10:08, Andriy Gapon wrote:
>> on 06/12/2011 02:08 Alan Cox said the following:
>>> The right thing to do is to MFC vm_page_alloc_contig().  It shouldn't be that
>>> hard to merge it.
>>
>> Ah, but we want to be able to run virtualbox on the past releases too.
> 
> Which releases are we talking about here? VirtualBox was always deprecating
> older FreeBSD releases much faster than our Security EOL so if we focus on latest
> 8.x and >= 9.0 I think it should be okay.
> 
> We have already deprecated 7.x because of some other problems with the userland
> bits so we can forget about 7.x completely. What is left is 8.1 and 8.2 and both
> reach EOL in mid 2012. So if it is possible to get it merged in time for 8.3 and
> 9.1 (if needed) we should be fine.

So we would just say "screw you" to people who stick to e.g. 8.2 at the moment? :)

-- 
Andriy Gapon


Re: Freeze with 10.0 and VirtualBox {4.1.4|4.1.6|4.1.51r38464}

2011-12-06 Thread Bernhard Froehlich

On 06.12.2011 10:08, Andriy Gapon wrote:
> on 06/12/2011 02:08 Alan Cox said the following:
>> The right thing to do is to MFC vm_page_alloc_contig().  It shouldn't be that
>> hard to merge it.
>
> Ah, but we want to be able to run virtualbox on the past releases too.

Which releases are we talking about here? VirtualBox was always deprecating
older FreeBSD releases much faster than our Security EOL so if we focus on latest
8.x and >= 9.0 I think it should be okay.

We have already deprecated 7.x because of some other problems with the userland
bits so we can forget about 7.x completely. What is left is 8.1 and 8.2 and both
reach EOL in mid 2012. So if it is possible to get it merged in time for 8.3 and
9.1 (if needed) we should be fine.

--
Bernhard Froehlich
http://www.bluelife.at/


Re: Freeze with 10.0 and VirtualBox {4.1.4|4.1.6|4.1.51r38464}

2011-12-06 Thread Andriy Gapon
on 06/12/2011 02:08 Alan Cox said the following:
> The right thing to do is to MFC vm_page_alloc_contig().  It shouldn't be that
> hard to merge it.

Ah, but we want to be able to run virtualbox on the past releases too.

-- 
Andriy Gapon