Jenkins build is still unstable: FreeBSD_stable_10 #429

2016-10-17 Thread jenkins-admin
https://jenkins.FreeBSD.org/job/FreeBSD_stable_10/429/
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Repeatable panic on ZFS filesystem (used for backups); 11.0-STABLE

2016-10-17 Thread Steven Hartland



On 17/10/2016 22:50, Karl Denninger wrote:

I will make some effort on the sandbox machine to see if I can come up
with a way to replicate this.  I do have plenty of spare larger drives
laying around that used to be in service and were obsolesced due to
capacity -- but what I don't know is whether the system will misbehave
if the source is all spinning rust.

In other words:

1. Root filesystem is mirrored spinning rust (production is mirrored SSDs)

2. Backup is mirrored spinning rust (of approx the same size)

3. Set up auto-snapshots exactly as the production system has now (which
the sandbox does NOT have, since I don't care about incremental recovery
on that machine; it's a sandbox!)

4. Run a bunch of build-somethings (e.g. buildworlds, cross-build for
the Pi2s I have here, etc) to generate a LOT of filesystem entropy
across lots of snapshots.

5. Back that up.

6. Export the backup pool.

7. Re-import it and "zfs destroy -r" the backup filesystem.

That is what got me in a reboot loop after the *first* panic; I was
simply going to destroy the backup filesystem and re-run the backup, but
as soon as I issued that zfs destroy the machine panic'd and as soon as
I re-attached it after a reboot it panic'd again.  Repeat until I set
trim=0.

But... if I CAN replicate it that still shouldn't be happening, and the
system should *certainly* survive attempting to TRIM on a vdev that
doesn't support TRIMs, even if the removal is for a large amount of
space and/or files on the target, without blowing up.

BTW I bet it isn't that rare -- you can hit it if you're taking timed
snapshots on an active filesystem (with lots of entropy) and then make
the mistake of trying to remove those snapshots (as is the case with a
zfs destroy -r or a zfs recv of an incremental copy that attempts to
sync against a source) on a pool that has been imported before the
system realizes that TRIM is unavailable on those vdevs.

Noting this:

 Yes, I need to find some time to have a look at it, but given how rare
 this is and with TRIM being re-implemented upstream in a totally
 different manner I'm reluctant to spend any real time on it.

What's in-process in this regard, if you happen to have a reference?

Looks like it may be still in review: https://reviews.csiden.org/r/263/

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Repeatable panic on ZFS filesystem (used for backups); 11.0-STABLE

2016-10-17 Thread Karl Denninger
I will make some effort on the sandbox machine to see if I can come up
with a way to replicate this.  I do have plenty of spare larger drives
laying around that used to be in service and were obsolesced due to
capacity -- but what I don't know is whether the system will misbehave
if the source is all spinning rust.

In other words:

1. Root filesystem is mirrored spinning rust (production is mirrored SSDs)

2. Backup is mirrored spinning rust (of approx the same size)

3. Set up auto-snapshots exactly as the production system has now (which
the sandbox does NOT have, since I don't care about incremental recovery
on that machine; it's a sandbox!)

4. Run a bunch of build-somethings (e.g. buildworlds, cross-build for
the Pi2s I have here, etc) to generate a LOT of filesystem entropy
across lots of snapshots.

5. Back that up.

6. Export the backup pool.

7. Re-import it and "zfs destroy -r" the backup filesystem.
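
A rough sketch of steps 6 and 7, for anyone who wants to follow along
("backup" and "backup/fs" are placeholder pool/dataset names here, not
necessarily what my script uses):

# export the backup pool, re-import it, then recursively destroy the
# replicated filesystem tree (snapshots included)
zpool export backup
zpool import backup
zfs destroy -r backup/fs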

That is what got me in a reboot loop after the *first* panic; I was
simply going to destroy the backup filesystem and re-run the backup, but
as soon as I issued that zfs destroy the machine panic'd and as soon as
I re-attached it after a reboot it panic'd again.  Repeat until I set
trim=0.

But... if I CAN replicate it that still shouldn't be happening, and the
system should *certainly* survive attempting to TRIM on a vdev that
doesn't support TRIMs, even if the removal is for a large amount of
space and/or files on the target, without blowing up.

BTW I bet it isn't that rare -- you can hit it if you're taking timed
snapshots on an active filesystem (with lots of entropy) and then make
the mistake of trying to remove those snapshots (as is the case with a
zfs destroy -r or a zfs recv of an incremental copy that attempts to
sync against a source) on a pool that has been imported before the
system realizes that TRIM is unavailable on those vdevs.

Noting this:

Yes, I need to find some time to have a look at it, but given how rare
this is and with TRIM being re-implemented upstream in a totally
different manner I'm reluctant to spend any real time on it.

What's in-process in this regard, if you happen to have a reference?

On 10/17/2016 16:40, Steven Hartland wrote:
> Setting those values will only affect what's queued to the device, not
> what's actually outstanding.
>
> On 17/10/2016 21:22, Karl Denninger wrote:
>> Since I cleared it (by setting TRIM off on the test machine, rebooting,
>> importing the pool and noting that it did not panic -- pulled drives,
>> re-inserted into the production machine and ran backup routine -- all
>> was normal) it may be a while before I see it again (a week or so is
>> usual.)
>>
>> It appears to be related to entropy in the filesystem that comes up as
>> "eligible" to be removed from the backup volume, which (not
>> surprisingly) tends to happen a few days after I do a new world build or
>> something similar (the daily and/or periodic snapshots roll off at about
>> that point.)
>>
>> I don't happen to have a spare pair of high-performance SSDs I can stick
>> in the sandbox machine in an attempt to force the condition to assert
>> itself in test, unfortunately.
>>
>> I *am* concerned that it's not "simple" stack exhaustion because setting
>> the max outstanding TRIMs on a per-vdev basis down quite-dramatically
>> did *not* prevent it from happening -- and if it was simply stack depth
>> related I would have expected that to put a stop to it.
>>
>> On 10/17/2016 15:16, Steven Hartland wrote:
>>> Be good to confirm it's not an infinite loop by giving it a good bump
>>> first.
>>>
>>> On 17/10/2016 19:58, Karl Denninger wrote:
 I can certainly attempt setting that higher but is that not just
 hiding the problem rather than addressing it?


 On 10/17/2016 13:54, Steven Hartland wrote:
> You're hitting stack exhaustion, have you tried increasing the kernel
> stack pages?
> It can be changed from /boot/loader.conf
> kern.kstack_pages="6"
>
> Default on amd64 is 4 IIRC
>
> On 17/10/2016 19:08, Karl Denninger wrote:
>> The target (and devices that trigger this) are a pair of 4Gb 7200RPM
>> SATA rotating rust drives (zmirror) with each provider
>> geli-encrypted
>> (that is, the actual devices used for the pool create are the
>> .eli's)
>>
>> The machine generating the problem has both rotating rust devices
>> *and*
>> SSDs, so I can't simply shut TRIM off system-wide and call it a
>> day as
>> TRIM itself is heavily-used; both the boot/root pools and a
>> Postgresql
>> database pool are on SSDs, while several terabytes of lesser-used
>> data
>> is on a pool of Raidz2 that is made up of spinning rust.
> snip...
>> NewFS.denninger.net dumped core - see /var/crash/vmcore.1
>>
>> Mon Oct 17 09:02:33 CDT 2016
>>
>> FreeBSD NewFS.denninger.net 11.0-STABLE FreeBSD 11.0-STABLE #13
>> r307318M: Fri Oct 14 09:23:46 CDT 2016
>> 

Re: Repeatable panic on ZFS filesystem (used for backups); 11.0-STABLE

2016-10-17 Thread Steven Hartland

On 17/10/2016 20:52, Andriy Gapon wrote:

On 17/10/2016 21:54, Steven Hartland wrote:

You're hitting stack exhaustion, have you tried increasing the kernel stack 
pages?
It can be changed from /boot/loader.conf
kern.kstack_pages="6"

Default on amd64 is 4 IIRC

Steve,

perhaps you can think of a more proper fix? :-)
https://lists.freebsd.org/pipermail/freebsd-stable/2016-July/085047.html
Yes, I need to find some time to have a look at it, but given how rare this
is and with TRIM being re-implemented upstream in a totally different
manner I'm reluctant to spend any real time on it.

On 17/10/2016 19:08, Karl Denninger wrote:

The target (and devices that trigger this) are a pair of 4Gb 7200RPM
SATA rotating rust drives (zmirror) with each provider geli-encrypted
(that is, the actual devices used for the pool create are the .eli's)

The machine generating the problem has both rotating rust devices *and*
SSDs, so I can't simply shut TRIM off system-wide and call it a day as
TRIM itself is heavily-used; both the boot/root pools and a Postgresql
database pool are on SSDs, while several terabytes of lesser-used data
is on a pool of Raidz2 that is made up of spinning rust.

snip...

NewFS.denninger.net dumped core - see /var/crash/vmcore.1

Mon Oct 17 09:02:33 CDT 2016

FreeBSD NewFS.denninger.net 11.0-STABLE FreeBSD 11.0-STABLE #13
r307318M: Fri Oct 14 09:23:46 CDT 2016
k...@newfs.denninger.net:/usr/obj/usr/src/sys/KSD-SMP  amd64

panic: double fault

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:

Fatal double fault
rip = 0x8220d9ec
rsp = 0xfe066821f000
rbp = 0xfe066821f020
cpuid = 6; apic id = 14
panic: double fault
cpuid = 6
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfe0649d78e30
vpanic() at vpanic+0x182/frame 0xfe0649d78eb0
panic() at panic+0x43/frame 0xfe0649d78f10
dblfault_handler() at dblfault_handler+0xa2/frame 0xfe0649d78f30
Xdblfault() at Xdblfault+0xac/frame 0xfe0649d78f30
--- trap 0x17, rip = 0x8220d9ec, rsp = 0xfe066821f000, rbp =
0xfe066821f020 ---
avl_rotation() at avl_rotation+0xc/frame 0xfe066821f020
avl_remove() at avl_remove+0x1c8/frame 0xfe066821f070
vdev_queue_io_to_issue() at vdev_queue_io_to_issue+0x87f/frame
0xfe066821f530
vdev_queue_io_done() at vdev_queue_io_done+0x83/frame 0xfe066821f570
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f5a0
zio_execute() at zio_execute+0x23d/frame 0xfe066821f5f0
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f650
zio_execute() at zio_execute+0x23d/frame 0xfe066821f6a0
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f6e0
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f710
zio_execute() at zio_execute+0x23d/frame 0xfe066821f760
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f7c0
zio_execute() at zio_execute+0x23d/frame 0xfe066821f810
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f850
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f880
zio_execute() at zio_execute+0x23d/frame 0xfe066821f8d0
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f930
zio_execute() at zio_execute+0x23d/frame 0xfe066821f980
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f9c0
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f9f0
zio_execute() at zio_execute+0x23d/frame 0xfe066821fa40
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821faa0
zio_execute() at zio_execute+0x23d/frame 0xfe066821faf0
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821fb30
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821fb60
zio_execute() at zio_execute+0x23d/frame 0xfe066821fbb0
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821fc10
zio_execute() at zio_execute+0x23d/frame 0xfe066821fc60
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821fca0
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821fcd0
zio_execute() at zio_execute+0x23d/frame 0xfe066821fd20
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821fd80
zio_execute() at zio_execute+0x23d/frame 0xfe066821fdd0
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821fe10
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821fe40
zio_execute() at zio_execute+0x23d/frame 0xfe066821fe90
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821fef0
zio_execute() at zio_execute+0x23d/frame 

Re: Repeatable panic on ZFS filesystem (used for backups); 11.0-STABLE

2016-10-17 Thread Steven Hartland
Setting those values will only affect what's queued to the device, not
what's actually outstanding.


On 17/10/2016 21:22, Karl Denninger wrote:

Since I cleared it (by setting TRIM off on the test machine, rebooting,
importing the pool and noting that it did not panic -- pulled drives,
re-inserted into the production machine and ran backup routine -- all
was normal) it may be a while before I see it again (a week or so is usual.)

It appears to be related to entropy in the filesystem that comes up as
"eligible" to be removed from the backup volume, which (not
surprisingly) tends to happen a few days after I do a new world build or
something similar (the daily and/or periodic snapshots roll off at about
that point.)

I don't happen to have a spare pair of high-performance SSDs I can stick
in the sandbox machine in an attempt to force the condition to assert
itself in test, unfortunately.

I *am* concerned that it's not "simple" stack exhaustion because setting
the max outstanding TRIMs on a per-vdev basis down quite-dramatically
did *not* prevent it from happening -- and if it was simply stack depth
related I would have expected that to put a stop to it.

On 10/17/2016 15:16, Steven Hartland wrote:

Be good to confirm it's not an infinite loop by giving it a good bump
first.

On 17/10/2016 19:58, Karl Denninger wrote:

I can certainly attempt setting that higher but is that not just
hiding the problem rather than addressing it?


On 10/17/2016 13:54, Steven Hartland wrote:

You're hitting stack exhaustion, have you tried increasing the kernel
stack pages?
It can be changed from /boot/loader.conf
kern.kstack_pages="6"

Default on amd64 is 4 IIRC

On 17/10/2016 19:08, Karl Denninger wrote:

The target (and devices that trigger this) are a pair of 4Gb 7200RPM
SATA rotating rust drives (zmirror) with each provider geli-encrypted
(that is, the actual devices used for the pool create are the .eli's)

The machine generating the problem has both rotating rust devices
*and*
SSDs, so I can't simply shut TRIM off system-wide and call it a day as
TRIM itself is heavily-used; both the boot/root pools and a Postgresql
database pool are on SSDs, while several terabytes of lesser-used data
is on a pool of Raidz2 that is made up of spinning rust.

snip...

NewFS.denninger.net dumped core - see /var/crash/vmcore.1

Mon Oct 17 09:02:33 CDT 2016

FreeBSD NewFS.denninger.net 11.0-STABLE FreeBSD 11.0-STABLE #13
r307318M: Fri Oct 14 09:23:46 CDT 2016
k...@newfs.denninger.net:/usr/obj/usr/src/sys/KSD-SMP  amd64

panic: double fault

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and
you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for
details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:

Fatal double fault
rip = 0x8220d9ec
rsp = 0xfe066821f000
rbp = 0xfe066821f020
cpuid = 6; apic id = 14
panic: double fault
cpuid = 6
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfe0649d78e30
vpanic() at vpanic+0x182/frame 0xfe0649d78eb0
panic() at panic+0x43/frame 0xfe0649d78f10
dblfault_handler() at dblfault_handler+0xa2/frame 0xfe0649d78f30
Xdblfault() at Xdblfault+0xac/frame 0xfe0649d78f30
--- trap 0x17, rip = 0x8220d9ec, rsp = 0xfe066821f000,
rbp =
0xfe066821f020 ---
avl_rotation() at avl_rotation+0xc/frame 0xfe066821f020
avl_remove() at avl_remove+0x1c8/frame 0xfe066821f070
vdev_queue_io_to_issue() at vdev_queue_io_to_issue+0x87f/frame
0xfe066821f530
vdev_queue_io_done() at vdev_queue_io_done+0x83/frame
0xfe066821f570
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f5a0
zio_execute() at zio_execute+0x23d/frame 0xfe066821f5f0
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame
0xfe066821f650
zio_execute() at zio_execute+0x23d/frame 0xfe066821f6a0
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame
0xfe066821f6e0
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f710
zio_execute() at zio_execute+0x23d/frame 0xfe066821f760
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame
0xfe066821f7c0
zio_execute() at zio_execute+0x23d/frame 0xfe066821f810
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame
0xfe066821f850
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f880
zio_execute() at zio_execute+0x23d/frame 0xfe066821f8d0
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame
0xfe066821f930
zio_execute() at zio_execute+0x23d/frame 0xfe066821f980
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame
0xfe066821f9c0
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f9f0
zio_execute() at zio_execute+0x23d/frame 0xfe066821fa40
zio_vdev_io_start() at 

Re: Repeatable panic on ZFS filesystem (used for backups); 11.0-STABLE

2016-10-17 Thread Karl Denninger
Since I cleared it (by setting TRIM off on the test machine, rebooting,
importing the pool and noting that it did not panic -- pulled drives,
re-inserted into the production machine and ran backup routine -- all
was normal) it may be a while before I see it again (a week or so is usual.)

It appears to be related to entropy in the filesystem that comes up as
"eligible" to be removed from the backup volume, which (not
surprisingly) tends to happen a few days after I do a new world build or
something similar (the daily and/or periodic snapshots roll off at about
that point.)

I don't happen to have a spare pair of high-performance SSDs I can stick
in the sandbox machine in an attempt to force the condition to assert
itself in test, unfortunately.

I *am* concerned that it's not "simple" stack exhaustion because setting
the max outstanding TRIMs on a per-vdev basis down quite-dramatically
did *not* prevent it from happening -- and if it was simply stack depth
related I would have expected that to put a stop to it.

On 10/17/2016 15:16, Steven Hartland wrote:
> Be good to confirm it's not an infinite loop by giving it a good bump
> first.
>
> On 17/10/2016 19:58, Karl Denninger wrote:
>> I can certainly attempt setting that higher but is that not just
>> hiding the problem rather than addressing it?
>>
>>
>> On 10/17/2016 13:54, Steven Hartland wrote:
>>> You're hitting stack exhaustion, have you tried increasing the kernel
>>> stack pages?
>>> It can be changed from /boot/loader.conf
>>> kern.kstack_pages="6"
>>>
>>> Default on amd64 is 4 IIRC
>>>
>>> On 17/10/2016 19:08, Karl Denninger wrote:
 The target (and devices that trigger this) are a pair of 4Gb 7200RPM
 SATA rotating rust drives (zmirror) with each provider geli-encrypted
 (that is, the actual devices used for the pool create are the .eli's)

 The machine generating the problem has both rotating rust devices
 *and*
 SSDs, so I can't simply shut TRIM off system-wide and call it a day as
 TRIM itself is heavily-used; both the boot/root pools and a Postgresql
 database pool are on SSDs, while several terabytes of lesser-used data
 is on a pool of Raidz2 that is made up of spinning rust.
>>> snip...
 NewFS.denninger.net dumped core - see /var/crash/vmcore.1

 Mon Oct 17 09:02:33 CDT 2016

 FreeBSD NewFS.denninger.net 11.0-STABLE FreeBSD 11.0-STABLE #13
 r307318M: Fri Oct 14 09:23:46 CDT 2016
 k...@newfs.denninger.net:/usr/obj/usr/src/sys/KSD-SMP  amd64

 panic: double fault

 GNU gdb 6.1.1 [FreeBSD]
 Copyright 2004 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and
 you are
 welcome to change it and/or distribute copies of it under certain
 conditions.
 Type "show copying" to see the conditions.
 There is absolutely no warranty for GDB.  Type "show warranty" for
 details.
 This GDB was configured as "amd64-marcel-freebsd"...

 Unread portion of the kernel message buffer:

 Fatal double fault
 rip = 0x8220d9ec
 rsp = 0xfe066821f000
 rbp = 0xfe066821f020
 cpuid = 6; apic id = 14
 panic: double fault
 cpuid = 6
 KDB: stack backtrace:
 db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
 0xfe0649d78e30
 vpanic() at vpanic+0x182/frame 0xfe0649d78eb0
 panic() at panic+0x43/frame 0xfe0649d78f10
 dblfault_handler() at dblfault_handler+0xa2/frame 0xfe0649d78f30
 Xdblfault() at Xdblfault+0xac/frame 0xfe0649d78f30
 --- trap 0x17, rip = 0x8220d9ec, rsp = 0xfe066821f000,
 rbp =
 0xfe066821f020 ---
 avl_rotation() at avl_rotation+0xc/frame 0xfe066821f020
 avl_remove() at avl_remove+0x1c8/frame 0xfe066821f070
 vdev_queue_io_to_issue() at vdev_queue_io_to_issue+0x87f/frame
 0xfe066821f530
 vdev_queue_io_done() at vdev_queue_io_done+0x83/frame
 0xfe066821f570
 zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f5a0
 zio_execute() at zio_execute+0x23d/frame 0xfe066821f5f0
 zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame
 0xfe066821f650
 zio_execute() at zio_execute+0x23d/frame 0xfe066821f6a0
 vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame
 0xfe066821f6e0
 zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f710
 zio_execute() at zio_execute+0x23d/frame 0xfe066821f760
 zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame
 0xfe066821f7c0
 zio_execute() at zio_execute+0x23d/frame 0xfe066821f810
 vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame
 0xfe066821f850
 zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f880
 zio_execute() at zio_execute+0x23d/frame 0xfe066821f8d0
 zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame
 0xfe066821f930
 zio_execute() at 

Re: Repeatable panic on ZFS filesystem (used for backups); 11.0-STABLE

2016-10-17 Thread Steven Hartland

Be good to confirm it's not an infinite loop by giving it a good bump first.

On 17/10/2016 19:58, Karl Denninger wrote:

I can certainly attempt setting that higher but is that not just
hiding the problem rather than addressing it?


On 10/17/2016 13:54, Steven Hartland wrote:

You're hitting stack exhaustion, have you tried increasing the kernel
stack pages?
It can be changed from /boot/loader.conf
kern.kstack_pages="6"

Default on amd64 is 4 IIRC

On 17/10/2016 19:08, Karl Denninger wrote:

The target (and devices that trigger this) are a pair of 4Gb 7200RPM
SATA rotating rust drives (zmirror) with each provider geli-encrypted
(that is, the actual devices used for the pool create are the .eli's)

The machine generating the problem has both rotating rust devices *and*
SSDs, so I can't simply shut TRIM off system-wide and call it a day as
TRIM itself is heavily-used; both the boot/root pools and a Postgresql
database pool are on SSDs, while several terabytes of lesser-used data
is on a pool of Raidz2 that is made up of spinning rust.

snip...

NewFS.denninger.net dumped core - see /var/crash/vmcore.1

Mon Oct 17 09:02:33 CDT 2016

FreeBSD NewFS.denninger.net 11.0-STABLE FreeBSD 11.0-STABLE #13
r307318M: Fri Oct 14 09:23:46 CDT 2016
k...@newfs.denninger.net:/usr/obj/usr/src/sys/KSD-SMP  amd64

panic: double fault

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and
you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for
details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:

Fatal double fault
rip = 0x8220d9ec
rsp = 0xfe066821f000
rbp = 0xfe066821f020
cpuid = 6; apic id = 14
panic: double fault
cpuid = 6
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfe0649d78e30
vpanic() at vpanic+0x182/frame 0xfe0649d78eb0
panic() at panic+0x43/frame 0xfe0649d78f10
dblfault_handler() at dblfault_handler+0xa2/frame 0xfe0649d78f30
Xdblfault() at Xdblfault+0xac/frame 0xfe0649d78f30
--- trap 0x17, rip = 0x8220d9ec, rsp = 0xfe066821f000, rbp =
0xfe066821f020 ---
avl_rotation() at avl_rotation+0xc/frame 0xfe066821f020
avl_remove() at avl_remove+0x1c8/frame 0xfe066821f070
vdev_queue_io_to_issue() at vdev_queue_io_to_issue+0x87f/frame
0xfe066821f530
vdev_queue_io_done() at vdev_queue_io_done+0x83/frame 0xfe066821f570
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f5a0
zio_execute() at zio_execute+0x23d/frame 0xfe066821f5f0
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f650
zio_execute() at zio_execute+0x23d/frame 0xfe066821f6a0
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f6e0
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f710
zio_execute() at zio_execute+0x23d/frame 0xfe066821f760
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f7c0
zio_execute() at zio_execute+0x23d/frame 0xfe066821f810
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f850
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f880
zio_execute() at zio_execute+0x23d/frame 0xfe066821f8d0
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f930
zio_execute() at zio_execute+0x23d/frame 0xfe066821f980
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f9c0
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f9f0
zio_execute() at zio_execute+0x23d/frame 0xfe066821fa40
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821faa0
zio_execute() at zio_execute+0x23d/frame 0xfe066821faf0
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821fb30
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821fb60
zio_execute() at zio_execute+0x23d/frame 0xfe066821fbb0
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821fc10
zio_execute() at zio_execute+0x23d/frame 0xfe066821fc60
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821fca0
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821fcd0
zio_execute() at zio_execute+0x23d/frame 0xfe066821fd20
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821fd80
zio_execute() at zio_execute+0x23d/frame 0xfe066821fdd0
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821fe10
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821fe40
zio_execute() at zio_execute+0x23d/frame 0xfe066821fe90
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821fef0
zio_execute() at zio_execute+0x23d/frame 0xfe066821ff40
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821ff80
zio_vdev_io_done() at 

Re: Repeatable panic on ZFS filesystem (used for backups); 11.0-STABLE

2016-10-17 Thread Andriy Gapon
On 17/10/2016 21:54, Steven Hartland wrote:
> You're hitting stack exhaustion, have you tried increasing the kernel stack 
> pages?
> It can be changed from /boot/loader.conf
> kern.kstack_pages="6"
> 
> Default on amd64 is 4 IIRC

Steve,

perhaps you can think of a more proper fix? :-)
https://lists.freebsd.org/pipermail/freebsd-stable/2016-July/085047.html

> On 17/10/2016 19:08, Karl Denninger wrote:
>> The target (and devices that trigger this) are a pair of 4Gb 7200RPM
>> SATA rotating rust drives (zmirror) with each provider geli-encrypted
>> (that is, the actual devices used for the pool create are the .eli's)
>>
>> The machine generating the problem has both rotating rust devices *and*
>> SSDs, so I can't simply shut TRIM off system-wide and call it a day as
>> TRIM itself is heavily-used; both the boot/root pools and a Postgresql
>> database pool are on SSDs, while several terabytes of lesser-used data
>> is on a pool of Raidz2 that is made up of spinning rust.
> snip...
>>
>> NewFS.denninger.net dumped core - see /var/crash/vmcore.1
>>
>> Mon Oct 17 09:02:33 CDT 2016
>>
>> FreeBSD NewFS.denninger.net 11.0-STABLE FreeBSD 11.0-STABLE #13
>> r307318M: Fri Oct 14 09:23:46 CDT 2016
>> k...@newfs.denninger.net:/usr/obj/usr/src/sys/KSD-SMP  amd64
>>
>> panic: double fault
>>
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>> This GDB was configured as "amd64-marcel-freebsd"...
>>
>> Unread portion of the kernel message buffer:
>>
>> Fatal double fault
>> rip = 0x8220d9ec
>> rsp = 0xfe066821f000
>> rbp = 0xfe066821f020
>> cpuid = 6; apic id = 14
>> panic: double fault
>> cpuid = 6
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
>> 0xfe0649d78e30
>> vpanic() at vpanic+0x182/frame 0xfe0649d78eb0
>> panic() at panic+0x43/frame 0xfe0649d78f10
>> dblfault_handler() at dblfault_handler+0xa2/frame 0xfe0649d78f30
>> Xdblfault() at Xdblfault+0xac/frame 0xfe0649d78f30
>> --- trap 0x17, rip = 0x8220d9ec, rsp = 0xfe066821f000, rbp =
>> 0xfe066821f020 ---
>> avl_rotation() at avl_rotation+0xc/frame 0xfe066821f020
>> avl_remove() at avl_remove+0x1c8/frame 0xfe066821f070
>> vdev_queue_io_to_issue() at vdev_queue_io_to_issue+0x87f/frame
>> 0xfe066821f530
>> vdev_queue_io_done() at vdev_queue_io_done+0x83/frame 0xfe066821f570
>> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f5a0
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821f5f0
>> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f650
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821f6a0
>> vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f6e0
>> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f710
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821f760
>> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f7c0
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821f810
>> vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f850
>> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f880
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821f8d0
>> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f930
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821f980
>> vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f9c0
>> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f9f0
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821fa40
>> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821faa0
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821faf0
>> vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821fb30
>> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821fb60
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821fbb0
>> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821fc10
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821fc60
>> vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821fca0
>> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821fcd0
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821fd20
>> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821fd80
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821fdd0
>> vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821fe10
>> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821fe40
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821fe90
>> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821fef0
>> zio_execute() 

Jenkins build is still unstable: FreeBSD_stable_10 #428

2016-10-17 Thread jenkins-admin
https://jenkins.FreeBSD.org/job/FreeBSD_stable_10/428/
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Repeatable panic on ZFS filesystem (used for backups); 11.0-STABLE

2016-10-17 Thread Karl Denninger
I can certainly attempt setting that higher but is that not just
hiding the problem rather than addressing it?


On 10/17/2016 13:54, Steven Hartland wrote:
> You're hitting stack exhaustion, have you tried increasing the kernel
> stack pages?
> It can be changed from /boot/loader.conf
> kern.kstack_pages="6"
>
> Default on amd64 is 4 IIRC
>
> On 17/10/2016 19:08, Karl Denninger wrote:
>> The target (and devices that trigger this) are a pair of 4Gb 7200RPM
>> SATA rotating rust drives (zmirror) with each provider geli-encrypted
>> (that is, the actual devices used for the pool create are the .eli's)
>>
>> The machine generating the problem has both rotating rust devices *and*
>> SSDs, so I can't simply shut TRIM off system-wide and call it a day as
>> TRIM itself is heavily-used; both the boot/root pools and a Postgresql
>> database pool are on SSDs, while several terabytes of lesser-used data
>> is on a pool of Raidz2 that is made up of spinning rust.
> snip...
>>
>> NewFS.denninger.net dumped core - see /var/crash/vmcore.1
>>
>> Mon Oct 17 09:02:33 CDT 2016
>>
>> FreeBSD NewFS.denninger.net 11.0-STABLE FreeBSD 11.0-STABLE #13
>> r307318M: Fri Oct 14 09:23:46 CDT 2016
>> k...@newfs.denninger.net:/usr/obj/usr/src/sys/KSD-SMP  amd64
>>
>> panic: double fault
>>
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and
>> you are
>> welcome to change it and/or distribute copies of it under certain
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for
>> details.
>> This GDB was configured as "amd64-marcel-freebsd"...
>>
>> Unread portion of the kernel message buffer:
>>
>> Fatal double fault
>> rip = 0x8220d9ec
>> rsp = 0xfe066821f000
>> rbp = 0xfe066821f020
>> cpuid = 6; apic id = 14
>> panic: double fault
>> cpuid = 6
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
>> 0xfe0649d78e30
>> vpanic() at vpanic+0x182/frame 0xfe0649d78eb0
>> panic() at panic+0x43/frame 0xfe0649d78f10
>> dblfault_handler() at dblfault_handler+0xa2/frame 0xfe0649d78f30
>> Xdblfault() at Xdblfault+0xac/frame 0xfe0649d78f30
>> --- trap 0x17, rip = 0x8220d9ec, rsp = 0xfe066821f000, rbp =
>> 0xfe066821f020 ---
>> avl_rotation() at avl_rotation+0xc/frame 0xfe066821f020
>> avl_remove() at avl_remove+0x1c8/frame 0xfe066821f070
>> vdev_queue_io_to_issue() at vdev_queue_io_to_issue+0x87f/frame
>> 0xfe066821f530
>> vdev_queue_io_done() at vdev_queue_io_done+0x83/frame 0xfe066821f570
>> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f5a0
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821f5f0
>> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f650
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821f6a0
>> vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f6e0
>> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f710
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821f760
>> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f7c0
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821f810
>> vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f850
>> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f880
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821f8d0
>> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f930
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821f980
>> vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f9c0
>> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f9f0
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821fa40
>> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821faa0
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821faf0
>> vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821fb30
>> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821fb60
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821fbb0
>> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821fc10
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821fc60
>> vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821fca0
>> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821fcd0
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821fd20
>> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821fd80
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821fdd0
>> vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821fe10
>> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821fe40
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821fe90
>> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821fef0
>> zio_execute() at 

Re: Repeatable panic on ZFS filesystem (used for backups); 11.0-STABLE

2016-10-17 Thread Steven Hartland
You're hitting stack exhaustion, have you tried increasing the kernel 
stack pages?

It can be changed from /boot/loader.conf
kern.kstack_pages="6"

Default on amd64 is 4 IIRC
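
For example, checking the current value and bumping it looks roughly
like this (a sketch only; 6 is just the value suggested above, and the
setting takes effect after a reboot):

# show the current kernel stack size, in pages
sysctl kern.kstack_pages

# then add to /boot/loader.conf and reboot
kern.kstack_pages="6"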

On 17/10/2016 19:08, Karl Denninger wrote:

The target (and devices that trigger this) are a pair of 4Gb 7200RPM
SATA rotating rust drives (zmirror) with each provider geli-encrypted
(that is, the actual devices used for the pool create are the .eli's)

The machine generating the problem has both rotating rust devices *and*
SSDs, so I can't simply shut TRIM off system-wide and call it a day as
TRIM itself is heavily-used; both the boot/root pools and a Postgresql
database pool are on SSDs, while several terabytes of lesser-used data
is on a pool of Raidz2 that is made up of spinning rust.

snip...


NewFS.denninger.net dumped core - see /var/crash/vmcore.1

Mon Oct 17 09:02:33 CDT 2016

FreeBSD NewFS.denninger.net 11.0-STABLE FreeBSD 11.0-STABLE #13
r307318M: Fri Oct 14 09:23:46 CDT 2016
k...@newfs.denninger.net:/usr/obj/usr/src/sys/KSD-SMP  amd64

panic: double fault

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:

Fatal double fault
rip = 0x8220d9ec
rsp = 0xfe066821f000
rbp = 0xfe066821f020
cpuid = 6; apic id = 14
panic: double fault
cpuid = 6
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfe0649d78e30
vpanic() at vpanic+0x182/frame 0xfe0649d78eb0
panic() at panic+0x43/frame 0xfe0649d78f10
dblfault_handler() at dblfault_handler+0xa2/frame 0xfe0649d78f30
Xdblfault() at Xdblfault+0xac/frame 0xfe0649d78f30
--- trap 0x17, rip = 0x8220d9ec, rsp = 0xfe066821f000, rbp =
0xfe066821f020 ---
avl_rotation() at avl_rotation+0xc/frame 0xfe066821f020
avl_remove() at avl_remove+0x1c8/frame 0xfe066821f070
vdev_queue_io_to_issue() at vdev_queue_io_to_issue+0x87f/frame
0xfe066821f530
vdev_queue_io_done() at vdev_queue_io_done+0x83/frame 0xfe066821f570
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f5a0
zio_execute() at zio_execute+0x23d/frame 0xfe066821f5f0
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f650
zio_execute() at zio_execute+0x23d/frame 0xfe066821f6a0
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f6e0
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f710
zio_execute() at zio_execute+0x23d/frame 0xfe066821f760
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f7c0
zio_execute() at zio_execute+0x23d/frame 0xfe066821f810
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f850
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f880
zio_execute() at zio_execute+0x23d/frame 0xfe066821f8d0
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f930
zio_execute() at zio_execute+0x23d/frame 0xfe066821f980
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f9c0
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f9f0
zio_execute() at zio_execute+0x23d/frame 0xfe066821fa40
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821faa0
zio_execute() at zio_execute+0x23d/frame 0xfe066821faf0
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821fb30
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821fb60
zio_execute() at zio_execute+0x23d/frame 0xfe066821fbb0
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821fc10
zio_execute() at zio_execute+0x23d/frame 0xfe066821fc60
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821fca0
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821fcd0
zio_execute() at zio_execute+0x23d/frame 0xfe066821fd20
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821fd80
zio_execute() at zio_execute+0x23d/frame 0xfe066821fdd0
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821fe10
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821fe40
zio_execute() at zio_execute+0x23d/frame 0xfe066821fe90
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821fef0
zio_execute() at zio_execute+0x23d/frame 0xfe066821ff40
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821ff80
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821ffb0
zio_execute() at zio_execute+0x23d/frame 0xfe066822
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe0668220060
zio_execute() at zio_execute+0x23d/frame 0xfe06682200b0
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 

Re: Repeatable panic on ZFS filesystem (used for backups); 11.0-STABLE

2016-10-17 Thread Karl Denninger
The target (and devices that trigger this) are a pair of 4Gb 7200RPM
SATA rotating rust drives (zmirror) with each provider geli-encrypted
(that is, the actual devices used for the pool create are the .eli's)

The machine generating the problem has both rotating rust devices *and*
SSDs, so I can't simply shut TRIM off system-wide and call it a day as
TRIM itself is heavily-used; both the boot/root pools and a Postgresql
database pool are on SSDs, while several terabytes of lesser-used data
is on a pool of Raidz2 that is made up of spinning rust.
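
For anyone trying to picture the layout: the pool sits on the .eli
providers, not the raw disks.  A rough sketch of that kind of setup
(ada2/ada3 are purely hypothetical device names here, not the real ones):

# attach the encrypted providers, then mirror the .eli devices
geli attach /dev/ada2
geli attach /dev/ada3
zpool create backup mirror /dev/ada2.eli /dev/ada3.eli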

vfs.zfs.trim.max_interval: 1
vfs.zfs.trim.timeout: 30
vfs.zfs.trim.txg_delay: 32
vfs.zfs.trim.enabled: 1
vfs.zfs.vdev.trim_max_pending: 1
vfs.zfs.vdev.trim_max_active: 64
vfs.zfs.vdev.trim_min_active: 1
vfs.zfs.vdev.trim_on_init: 1
kstat.zfs.misc.zio_trim.failed: 0
kstat.zfs.misc.zio_trim.unsupported: 1080
kstat.zfs.misc.zio_trim.success: 573768
kstat.zfs.misc.zio_trim.bytes: 28964282368

The machine in question has been up for ~3 hours now since the last
panic, so obviously TRIM is being heavily used...

The issue, once the problem has been created, is *portable* and is
not being caused by the SSD source drives.  That is, once the machine
panics, if I remove the two disks that form the backup pool, physically
move them to my sandbox machine, geli attach the drives and import the
pool, within seconds the second machine will panic in the identical
fashion.  It's possible (but not proven) that if I were to reboot
enough times the filesystem would eventually reach consistency with the
removed snapshots all gone and the panics would stop, but I got a
half-dozen of them sequentially this morning on my test machine, so I'm
not at all sure how many more I'd need to allow to run, or whether *any*
of the removals committed before the panic (if none did, then the cycle of
reboot/attach/panic would never end) :-)

Reducing trim_max_active (to 10, a quite-drastic reduction) did not stop
the panics.
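
(That reduction used the per-vdev queue knob from the sysctl list
above; a sketch, in case anyone wants to reproduce it -- it can be
changed at runtime:)

# cut the maximum number of concurrently active TRIMs per vdev
sysctl vfs.zfs.vdev.trim_max_active=10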

What appears to be happening is that the removal of the datasets in
question on a reasonably newly-imported pool, whether it occurs by the
incremental zfs recv -Fudv or by zfs destroy -r from the command line,
generates a large number of TRIM requests, which are of course rejected
by the providers as spinning rust does not support them.  However, the
attempt to queue them generates a stack overflow and a double-fault panic,
and since once the command is issued the filesystem has the deletes
pending and its consistent state is in fact with them gone, any attempt
to reattach the drives with TRIM enabled can result in an immediate
additional panic.
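
For illustration, the two operations that trigger the same mass free
look roughly like this (the dataset and snapshot names are hypothetical
placeholders, not the ones my script actually uses):

# incremental replication; with -R/-F the receiver also destroys
# snapshots and datasets that no longer exist on the source
zfs send -R -i @prev zroot@now | zfs recv -Fudv backup

# or the equivalent mass removal done by hand
zfs destroy -r backup/zroot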

I tried to work around this in my backup script by creating and then
destroying a file on the backup volume, then sleeping for a few seconds
before the backup actually commenced, in the hope that this would (1)
trigger a TRIM attempt and (2) lead the system to recognize that the
target volume cannot support TRIM and thus stop trying to do so (and
thus avoid the storm that exhausts the stack and causes the panic).  That
approach, however (see below), failed to prevent the problem.

#
# Now try to trigger TRIM so that we don't have a storm of them
#
echo "Attempting to disable TRIM on spinning rust"

mount -t zfs backup/no-trim /mnt
dd if=/dev/random of=/mnt/kill-trim bs=128k count=2
echo "Performed 2 writes"
sleep 2
rm /mnt/kill-trim
echo "Performed delete of written file"
sleep 5
umount /mnt
echo "Unmounted temporary filesystem"
sleep 2
echo "TRIM disable theoretically done"


On 10/17/2016 12:43, Warner Losh wrote:
> what's your underlying media?
>
> Warner
>
>
> On Mon, Oct 17, 2016 at 10:02 AM, Karl Denninger  wrote:
>> Update from my test system:
>>
>> Setting vfs.zfs.vdev_trim_max_active to 10 (from default 64) does *not*
>> stop the panics.
>>
>> Setting vfs.zfs.vdev.trim.enabled = 0 (which requires a reboot) DOES
>> stop the panics.
>>
>> I am going to run a scrub on the pack, but I suspect the pack itself
>> (now that I can actually mount it without the machine blowing up!) is fine.
>>
>> THIS (OBVIOUSLY) NEEDS ATTENTION!
>>
>> On 10/17/2016 09:17, Karl Denninger wrote:
>>> This is a situation I've had happen before, and reported -- it appeared
>>> to be a kernel stack overflow, and it has gotten materially worse on
>>> 11.0-STABLE.
>>>
>>> The issue occurs after some period of time (normally a week or so.)  The
>>> system has a mirrored pair of large drives used for backup purposes to
>>> which ZFS snapshots are written using a script that iterates over the
>>> system.
>>>
>>> The panic /only /happens when the root filesystem is being sent, and it
>>> appears that the panic itself is being triggered by an I/O pattern on
>>> the /backup /drive -- not the source drives.  Zpool scrubs on the source
>>> are clean; I am going to run one now on the backup, but in the past that
>>> has been clean as well.
>>>
>>> I now have a *repeatable* panic in that if I attempt a "zfs list -rt all
>>> 

Re: Repeatable panic on ZFS filesystem (used for backups); 11.0-STABLE

2016-10-17 Thread Warner Losh
what's your underlying media?

Warner


On Mon, Oct 17, 2016 at 10:02 AM, Karl Denninger  wrote:
> Update from my test system:
>
> Setting vfs.zfs.vdev_trim_max_active to 10 (from default 64) does *not*
> stop the panics.
>
> Setting vfs.zfs.vdev.trim.enabled = 0 (which requires a reboot) DOES
> stop the panics.
>
> I am going to run a scrub on the pack, but I suspect the pack itself
> (now that I can actually mount it without the machine blowing up!) is fine.
>
> THIS (OBVIOUSLY) NEEDS ATTENTION!
>
> On 10/17/2016 09:17, Karl Denninger wrote:
>> This is a situation I've had happen before, and reported -- it appeared
>> to be a kernel stack overflow, and it has gotten materially worse on
>> 11.0-STABLE.
>>
>> The issue occurs after some period of time (normally a week or so.)  The
>> system has a mirrored pair of large drives used for backup purposes to
>> which ZFS snapshots are written using a script that iterates over the
>> system.
>>
>> The panic /only /happens when the root filesystem is being sent, and it
>> appears that the panic itself is being triggered by an I/O pattern on
>> the /backup /drive -- not the source drives.  Zpool scrubs on the source
>> are clean; I am going to run one now on the backup, but in the past that
>> has been clean as well.
>>
>> I now have a *repeatable* panic in that if I attempt a "zfs list -rt all
>> backup" on the backup volume I get the below panic.  A "zfs list"
>> does *not* panic the system.
>>
>> The operating theory previously (after digging through the passed
>> structures in the dump) was that the ZFS system was attempting to issue
>> TRIMs on a device that can't do them before the ZFS system realizes this
>> and stops asking (the backup volume is comprised of spinning rust) but
>> the appearance of the panic now on the volume when I simply do a "zfs
>> list -rt all backup" appears to negate that theory since no writes are
>> performed by that operation, and thus no TRIM calls should be issued.
>>
>> I can leave the backup volume in the state that causes this for a short
>> period of time in an attempt to find and fix this.
>>
>>
>> NewFS.denninger.net dumped core - see /var/crash/vmcore.1
>>
>> Mon Oct 17 09:02:33 CDT 2016
>>
>> FreeBSD NewFS.denninger.net 11.0-STABLE FreeBSD 11.0-STABLE #13
>> r307318M: Fri Oct 14 09:23:46 CDT 2016
>> k...@newfs.denninger.net:/usr/obj/usr/src/sys/KSD-SMP  amd64
>>
>> panic: double fault
>>
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>> This GDB was configured as "amd64-marcel-freebsd"...
>>
>> Unread portion of the kernel message buffer:
>>
>> Fatal double fault
>> rip = 0x8220d9ec
>> rsp = 0xfe066821f000
>> rbp = 0xfe066821f020
>> cpuid = 6; apic id = 14
>> panic: double fault
>> cpuid = 6
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
>> 0xfe0649d78e30
>> vpanic() at vpanic+0x182/frame 0xfe0649d78eb0
>> panic() at panic+0x43/frame 0xfe0649d78f10
>> dblfault_handler() at dblfault_handler+0xa2/frame 0xfe0649d78f30
>> Xdblfault() at Xdblfault+0xac/frame 0xfe0649d78f30
>> --- trap 0x17, rip = 0x8220d9ec, rsp = 0xfe066821f000, rbp =
>> 0xfe066821f020 ---
>> avl_rotation() at avl_rotation+0xc/frame 0xfe066821f020
>> avl_remove() at avl_remove+0x1c8/frame 0xfe066821f070
>> vdev_queue_io_to_issue() at vdev_queue_io_to_issue+0x87f/frame
>> 0xfe066821f530
>> vdev_queue_io_done() at vdev_queue_io_done+0x83/frame 0xfe066821f570
>> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f5a0
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821f5f0
>> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f650
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821f6a0
>> vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f6e0
>> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f710
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821f760
>> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f7c0
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821f810
>> vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f850
>> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f880
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821f8d0
>> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f930
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821f980
>> vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f9c0
>> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f9f0
>> zio_execute() at zio_execute+0x23d/frame 0xfe066821fa40
>> 

Re: I'm upset about FreeBSD

2016-10-17 Thread Yamagi Burmeister
On Mon, 17 Oct 2016 03:44:14 +0300
Rostislav Krasny  wrote:

> First of all I faced an old problem that I reported here a year ago:
> http://comments.gmane.org/gmane.os.freebsd.stable/96598
> A completely new USB flash drive flashed with the
> FreeBSD-11.0-RELEASE-i386-mini-memstick.img file kills Windows
> again. If I use the Rufus utility to write the img file (using DD mode),
> Windows dies immediately after the flashing. If I use
> Win32DiskImager (suggested by the Handbook) it doesn't reinitialize
> the USB storage and Windows dies only if I remove and re-insert that USB
> flash drive, or boot Windows while it is connected. Nothing has been
> done to fix this nasty bug in a year.

As was already said in the other answers, this is a bug in Windows,
particularly in the partition parser: partmgr.sys (running in kernel
mode) crashes while parsing the FreeBSD installation image's GPT
setup. This may be a variant of the bug known as "Kindle is crashing
Win 10":

http://answers.microsoft.com/en-us/windows/forum/windows_10-performance/plugging-in-kindle-is-crashing-windows-10-after/5db0d867-0822-4512-919e-3d7786353f95?page=1

That bug was patched on September 13 and I'm unable to reproduce the
crash on a fully patched Win 10 VM. But there is no patch for Win 7;
even with all patches applied, my Win 7 VM still crashes as soon
as the FreeBSD installation image is connected.

I did some debugging and I'm pretty sure that the problem is not the
pmbr used for classic BIOS boot but the GPT itself. But my knowledge
of GPT and especially Windows internals is limited. So maybe someone
with more insight can look into this.

Or even better: complain to Microsoft. Even if the GPT is invalid, it
should not crash the kernel.

Regards,
Yamagi

-- 
Homepage:  www.yamagi.org
XMPP:  yam...@yamagi.org
GnuPG/GPG: 0xEFBCCBCB
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Jenkins build is unstable: FreeBSD_stable_10 #427

2016-10-17 Thread jenkins-admin
https://jenkins.FreeBSD.org/job/FreeBSD_stable_10/427/
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: moving ezjail-based jails from 10.3 host to 11.0 host

2016-10-17 Thread Miroslav Lachman

Marko Cupać wrote on 2016/10/17 17:43:

[...]


Now, to give an answer to my own question: everything works fine, after -
of course - reinstalling all the packages with `pkg-static upgrade -f'.


Reinstalling all packages?
So you didn't just move the jails from one host to another, but upgraded
them to 11.0 as well?
Because I had the feeling that you just wanted to move them to another
machine. And if you just move them (complete with the shared basejail) you
don't need to reinstall packages.


As Oliver said, you can use rsync to transfer them without the need to
compress + scp + uncompress. It is a much faster transfer than making
archives.
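
Something along these lines should do it (a sketch only, assuming the
default ezjail layout under /usr/jails; adjust the paths and host name
to your setup):

# copy the jail trees to the new host, preserving hard links and
# numeric uid/gid mapping
rsync -aH --numeric-ids /usr/jails/ newhost:/usr/jails/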


Miroslav Lachman
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Re: Repeatable panic on ZFS filesystem (used for backups); 11.0-STABLE

2016-10-17 Thread Karl Denninger
Update from my test system:

Setting vfs.zfs.vdev_trim_max_active to 10 (from default 64) does *not*
stop the panics.

Setting vfs.zfs.vdev.trim.enabled = 0 (which requires a reboot) DOES
stop the panics.
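
For reference, the workaround boils down to one loader tunable (the
spelling below follows the sysctl listing elsewhere in this thread; it
has to go in /boot/loader.conf and needs a reboot to take effect):

# /boot/loader.conf -- disable ZFS TRIM globally as a stopgap
vfs.zfs.trim.enabled=0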

I am going to run a scrub on the pack, but I suspect the pack itself
(now that I can actually mount it without the machine blowing up!) is fine.

THIS (OBVIOUSLY) NEEDS ATTENTION!

On 10/17/2016 09:17, Karl Denninger wrote:
> This is a situation I've had happen before, and reported -- it appeared
> to be a kernel stack overflow, and it has gotten materially worse on
> 11.0-STABLE.
>
> The issue occurs after some period of time (normally a week or so.)  The
> system has a mirrored pair of large drives used for backup purposes to
> which ZFS snapshots are written using a script that iterates over the
> system.
>
> The panic /only /happens when the root filesystem is being sent, and it
> appears that the panic itself is being triggered by an I/O pattern on
> the /backup /drive -- not the source drives.  Zpool scrubs on the source
> are clean; I am going to run one now on the backup, but in the past that
> has been clean as well.
>
> I now have a *repeatable* panic in that if I attempt a "zfs list -rt all
> backup" on the backup volume I get the below panic.  A "zfs list"
> does *not* panic the system.
>
> The operating theory previously (after digging through the passed
> structures in the dump) was that the ZFS system was attempting to issue
> TRIMs on a device that can't do them before the ZFS system realizes this
> and stops asking (the backup volume is comprised of spinning rust) but
> the appearance of the panic now on the volume when I simply do a "zfs
> list -rt all backup" appears to negate that theory since no writes are
> performed by that operation, and thus no TRIM calls should be issued.
>
> I can leave the backup volume in the state that causes this for a short
> period of time in an attempt to find and fix this.
>
>
> NewFS.denninger.net dumped core - see /var/crash/vmcore.1
>
> Mon Oct 17 09:02:33 CDT 2016
>
> FreeBSD NewFS.denninger.net 11.0-STABLE FreeBSD 11.0-STABLE #13
> r307318M: Fri Oct 14 09:23:46 CDT 2016
> k...@newfs.denninger.net:/usr/obj/usr/src/sys/KSD-SMP  amd64
>
> panic: double fault
>
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
>
> Fatal double fault
> rip = 0x8220d9ec
> rsp = 0xfe066821f000
> rbp = 0xfe066821f020
> cpuid = 6; apic id = 14
> panic: double fault
> cpuid = 6
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> 0xfe0649d78e30
> vpanic() at vpanic+0x182/frame 0xfe0649d78eb0
> panic() at panic+0x43/frame 0xfe0649d78f10
> dblfault_handler() at dblfault_handler+0xa2/frame 0xfe0649d78f30
> Xdblfault() at Xdblfault+0xac/frame 0xfe0649d78f30
> --- trap 0x17, rip = 0x8220d9ec, rsp = 0xfe066821f000, rbp =
> 0xfe066821f020 ---
> avl_rotation() at avl_rotation+0xc/frame 0xfe066821f020
> avl_remove() at avl_remove+0x1c8/frame 0xfe066821f070
> vdev_queue_io_to_issue() at vdev_queue_io_to_issue+0x87f/frame
> 0xfe066821f530
> vdev_queue_io_done() at vdev_queue_io_done+0x83/frame 0xfe066821f570
> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f5a0
> zio_execute() at zio_execute+0x23d/frame 0xfe066821f5f0
> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f650
> zio_execute() at zio_execute+0x23d/frame 0xfe066821f6a0
> vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f6e0
> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f710
> zio_execute() at zio_execute+0x23d/frame 0xfe066821f760
> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f7c0
> zio_execute() at zio_execute+0x23d/frame 0xfe066821f810
> vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f850
> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f880
> zio_execute() at zio_execute+0x23d/frame 0xfe066821f8d0
> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f930
> zio_execute() at zio_execute+0x23d/frame 0xfe066821f980
> vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f9c0
> zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f9f0
> zio_execute() at zio_execute+0x23d/frame 0xfe066821fa40
> zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821faa0
> zio_execute() at zio_execute+0x23d/frame 0xfe066821faf0
> vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821fb30
> 

Re: moving ezjail-based jails from 10.3 host to 11.0 host

2016-10-17 Thread Marko Cupać
On Mon, 17 Oct 2016 15:55:07 +0200
Oliver Peter  wrote:

> On Mon, Oct 17, 2016 at 03:37:08PM +0200, Marko Cupać wrote:
> > I have 10.3 host which runs a dozen or so ezjail-based jails. I have
> > installed another 11.0 host, and I'd like to move jails to it.  
> 
> I would switch them to iocage+zfs, ezjail is sooo 90s. :)
> Have a look at the documentation:
>   http://iocage.readthedocs.io/en/latest/basic-use.html
> 
> All jail settings are stored within ZFS properties so an upcoming
> migration would only need a zfs send | zfs receive.
> 
> > Can I just archive jails on 10.3, scp them to 11.0, and re-create
> > them there by restoring from archive (-a switch)?  
> 
> Further I would recommend to use rsync -av instead of scp.

Oliver,

I do appreciate that you took the time to respond to my question. However,
when I asked how to move my ezjail-based jails, I meant exactly that.
I did not ask which is the best jail management system (for my use case
I'm completely fine with ezjail), or about the good and bad things
about ZFS (I have a hardware RAID controller which can't do JBOD on this
server), or about the advantages of rsync over scp (it doesn't make much
difference for a one-time transfer of a single .tar.gz file over LAN).

Now, to give an answer to my own question: everything works fine, after -
of course - reinstalling all the packages with `pkg-static upgrade -f'.

Regards,
-- 
Before enlightenment - chop wood, draw water.
After  enlightenment - chop wood, draw water.

Marko Cupać
https://www.mimar.rs/
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Re: update from 9.3 to 11.0

2016-10-17 Thread Zoran Kolic
> Index: head/usr.sbin/freebsd-update/freebsd-update.sh
> ===
> --- head/usr.sbin/freebsd-update/freebsd-update.sh  (revision 279900)
> +++ head/usr.sbin/freebsd-update/freebsd-update.sh  (revision 279901)
> @@ -1231,7 +1231,7 @@ fetch_metadata_sanity () {
># Some aliases to save space later: ${P} is a character which can
># appear in a path; ${M} is the four numeric metadata fields; and
># ${H} is a sha256 hash.
> -   P="[-+./:=%@_[~[:alnum:]]"
> +   P="[-+./:=,%@_[~[:alnum:]]"
>M="[0-9]+\|[0-9]+\|[0-9]+\|[0-9]+"
>H="[0-9a-f]{64}"

Sorry for the late reply, I've been quite busy these days.
This "comma" is the change in the line. I hope
it will be enough to upgrade straight to 11.0.
I assume there are also steps to change the etc files and
the compiler.
Whenever I upgrade, I have to sort out the nvidia driver
version needed for it, and some other smaller tasks.
On the laptop, that means compiling X things for the Intel
2000. Not an easy job, and unnerving too.
Best regards

   Zoran

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Clandestine USB SD card slot

2016-10-17 Thread George Mitchell
On 10/17/16 11:02, George Mitchell wrote:
> [...]
> After setting hw.sdhci.debug=1 and hw.mmc.debug=1 in /etc/sysctl.conf
> and doing a verbose boot, then inserting and removing an SD card, all
> I get in "dmesg | egrep mmc\|sdhci" is:
> 
> sdhci_pci0:  mem 0xf0c6c000-0xf0c6c0ff irq 16 at device
> 20.7 on pci0
> sdhci_pci0: 1 slot(s) allocated
> sdhci_pci0:  mem 0xf0c6c000-0xf0c6c0ff irq 16 at device
> 20.7 on pci0
> sdhci_pci0-slot0: 50MHz 8bits 3.3V DMA
> sdhci_pci0-slot0: == REGISTER DUMP ==
> sdhci_pci0-slot0: Sys addr: 0x | Version:  0x1001
> sdhci_pci0-slot0: Blk size: 0x | Blk cnt:  0x
> sdhci_pci0-slot0: Argument: 0x | Trn mode: 0x
> sdhci_pci0-slot0: Present:  0x01f2 | Host ctl: 0x
> sdhci_pci0-slot0: Power:0x | Blk gap:  0x
> sdhci_pci0-slot0: Wake-up:  0x | Clock:0x
> sdhci_pci0-slot0: Timeout:  0x | Int stat: 0x
> sdhci_pci0-slot0: Int enab: 0x01ff00fb | Sig enab: 0x01ff00fb
> sdhci_pci0-slot0: AC12 err: 0x | Slot int: 0x00ff
> sdhci_pci0-slot0: Caps: 0x21de32b2 | Max curr: 0x00c80064
> sdhci_pci0-slot0: ===
> sdhci_pci0: 1 slot(s) allocated
> 
> (Same for "egrep mmc\|sdhci /var/log/messages".)
> 
> "pciconf -lv" suggests this is a:
> sdhci_pci0@pci0:0:20:7: class=0x080501 card=0x08651025 chip=0x78131022
> rev=0x01 hdr=0x00
> vendor = 'Advanced Micro Devices, Inc. [AMD]'
> device = 'FCH SD Flash Controller'
> class  = base peripheral
> subclass   = SD host controller
> 
> Are there some quirks I should define for this controller?  -- George
> 

For those coming in late:
FreeBSD 10.3-RELEASE-p7 #0: Thu Aug 11 18:38:15 UTC 2016 -- George
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: moving ezjail-based jails from 10.3 host to 11.0 host

2016-10-17 Thread Herbert J. Skuhra
Oliver Peter skrev:
> 
> On Mon, Oct 17, 2016 at 03:37:08PM +0200, Marko Cupać wrote:
>> I have 10.3 host which runs a dozen or so ezjail-based jails. I have
>> installed another 11.0 host, and I'd like to move jails to it.
> 
> I would switch them to iocage+zfs, ezjail is sooo 90s. :)
> Have a look at the documentation:
>   http://iocage.readthedocs.io/en/latest/basic-use.html

The github page says:

**No longer supported. iocage is being rewritten in a different
  language.

--
Herbert
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Clandestine USB SD card slot

2016-10-17 Thread George Mitchell
On 10/16/16 17:40, George Mitchell wrote:
> On 10/16/16 14:16, Warner Losh wrote:
>> On Sun, Oct 16, 2016 at 12:08 PM, Warren Block  wrote:
>>> On Sun, 16 Oct 2016, George Mitchell wrote:
>>>
> So not only is it (apparently) recognized, but the sdhci_pci driver
> attached to it!  But inserting or removing a card shows no activity.
> What's my next step?  -- George
>>>
>>>
>>> Is a device created for the empty reader?  It's worth trying to force a
>>> retaste of that device with 'true > /dev/daX' after the card is inserted.
>>
>> Don't look for da anything. Look for mmcsd something. The sdhci_pci
>> driver provides disks that are mmcsdX. Looks like card change
>> interrupts aren't happening, or there's something else making the
>> driver unhappy with the SDHCI controller though...
>>
>> Warner
>> [...]
> 
> No /dev/mm*; no log output on card insertion/removal even with
> sysctl hw.sdhci.debug=1.  Other sysctl info:
> 
> sysctl -a | grep sdhci
> device sdhci
> hw.sdhci.enable_msi: 1
> hw.sdhci.debug: 1
> dev.sdhci_pci.0.%parent: pci0
> dev.sdhci_pci.0.%pnpinfo: vendor=0x1022 device=0x7813 subvendor=0x1025
> subdevice=0x0865 class=0x080501
> dev.sdhci_pci.0.%location: pci0:0:20:7
> dev.sdhci_pci.0.%driver: sdhci_pci
> dev.sdhci_pci.0.%desc: Generic SD HCI
> dev.sdhci_pci.%parent:
> 
> -- George
> 
After setting hw.sdhci.debug=1 and hw.mmc.debug=1 in /etc/sysctl.conf
and doing a verbose boot, then inserting and removing an SD card, all
I get in "dmesg | egrep mmc\|sdhci" is:

sdhci_pci0:  mem 0xf0c6c000-0xf0c6c0ff irq 16 at device
20.7 on pci0
sdhci_pci0: 1 slot(s) allocated
sdhci_pci0:  mem 0xf0c6c000-0xf0c6c0ff irq 16 at device
20.7 on pci0
sdhci_pci0-slot0: 50MHz 8bits 3.3V DMA
sdhci_pci0-slot0: == REGISTER DUMP ==
sdhci_pci0-slot0: Sys addr: 0x | Version:  0x1001
sdhci_pci0-slot0: Blk size: 0x | Blk cnt:  0x
sdhci_pci0-slot0: Argument: 0x | Trn mode: 0x
sdhci_pci0-slot0: Present:  0x01f2 | Host ctl: 0x
sdhci_pci0-slot0: Power:0x | Blk gap:  0x
sdhci_pci0-slot0: Wake-up:  0x | Clock:0x
sdhci_pci0-slot0: Timeout:  0x | Int stat: 0x
sdhci_pci0-slot0: Int enab: 0x01ff00fb | Sig enab: 0x01ff00fb
sdhci_pci0-slot0: AC12 err: 0x | Slot int: 0x00ff
sdhci_pci0-slot0: Caps: 0x21de32b2 | Max curr: 0x00c80064
sdhci_pci0-slot0: ===
sdhci_pci0: 1 slot(s) allocated

(Same for "egrep mmc\|sdhci /var/log/messages".)

"pciconf -lv" suggests this is a:
sdhci_pci0@pci0:0:20:7: class=0x080501 card=0x08651025 chip=0x78131022
rev=0x01 hdr=0x00
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'FCH SD Flash Controller'
class  = base peripheral
subclass   = SD host controller

Are there some quirks I should define for this controller?  -- George
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Repeatable panic on ZFS filesystem (used for backups); 11.0-STABLE

2016-10-17 Thread Karl Denninger
This is a situation I've had happen before, and reported -- it appeared
to be a kernel stack overflow, and it has gotten materially worse on
11.0-STABLE.

The issue occurs after some period of time (normally a week or so.)  The
system has a mirrored pair of large drives used for backup purposes to
which ZFS snapshots are written using a script that iterates over the
system.
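
For context, the backup flow is roughly of this shape (a sketch only, with
placeholder pool, dataset and snapshot names rather than the actual script):

  zfs snapshot -r zroot@backup-new
  zfs send -R -i zroot@backup-prev zroot@backup-new | zfs receive -duF backup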

The panic /only/ happens when the root filesystem is being sent, and it
appears that the panic itself is being triggered by an I/O pattern on
the /backup/ drive -- not the source drives.  Zpool scrubs on the source
are clean; I am going to run one now on the backup, but in the past that
has been clean as well.

I now have a *repeatable* panic in that if I attempt a "zfs list -rt all
backup" on the backup volume I get the below panic.  A "zfs list"
does *not* panic the system.

The operating theory previously (after digging through the passed
structures in the dump) was that the ZFS system was attempting to issue
TRIMs on a device that can't do them before the ZFS system realizes this
and stops asking (the backup volume is comprised of spinning rust) but
the appearance of the panic now on the volume when I simply do a "zfs
list -rt all backup" appears to negate that theory since no writes are
performed by that operation, and thus no TRIM calls should be issued.

I can leave the backup volume in the state that causes this for a short
period of time in an attempt to find and fix this.


NewFS.denninger.net dumped core - see /var/crash/vmcore.1

Mon Oct 17 09:02:33 CDT 2016

FreeBSD NewFS.denninger.net 11.0-STABLE FreeBSD 11.0-STABLE #13
r307318M: Fri Oct 14 09:23:46 CDT 2016
k...@newfs.denninger.net:/usr/obj/usr/src/sys/KSD-SMP  amd64

panic: double fault

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:

Fatal double fault
rip = 0x8220d9ec
rsp = 0xfe066821f000
rbp = 0xfe066821f020
cpuid = 6; apic id = 14
panic: double fault
cpuid = 6
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfe0649d78e30
vpanic() at vpanic+0x182/frame 0xfe0649d78eb0
panic() at panic+0x43/frame 0xfe0649d78f10
dblfault_handler() at dblfault_handler+0xa2/frame 0xfe0649d78f30
Xdblfault() at Xdblfault+0xac/frame 0xfe0649d78f30
--- trap 0x17, rip = 0x8220d9ec, rsp = 0xfe066821f000, rbp =
0xfe066821f020 ---
avl_rotation() at avl_rotation+0xc/frame 0xfe066821f020
avl_remove() at avl_remove+0x1c8/frame 0xfe066821f070
vdev_queue_io_to_issue() at vdev_queue_io_to_issue+0x87f/frame
0xfe066821f530
vdev_queue_io_done() at vdev_queue_io_done+0x83/frame 0xfe066821f570
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f5a0
zio_execute() at zio_execute+0x23d/frame 0xfe066821f5f0
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f650
zio_execute() at zio_execute+0x23d/frame 0xfe066821f6a0
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f6e0
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f710
zio_execute() at zio_execute+0x23d/frame 0xfe066821f760
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f7c0
zio_execute() at zio_execute+0x23d/frame 0xfe066821f810
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f850
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f880
zio_execute() at zio_execute+0x23d/frame 0xfe066821f8d0
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821f930
zio_execute() at zio_execute+0x23d/frame 0xfe066821f980
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821f9c0
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821f9f0
zio_execute() at zio_execute+0x23d/frame 0xfe066821fa40
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821faa0
zio_execute() at zio_execute+0x23d/frame 0xfe066821faf0
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821fb30
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821fb60
zio_execute() at zio_execute+0x23d/frame 0xfe066821fbb0
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821fc10
zio_execute() at zio_execute+0x23d/frame 0xfe066821fc60
vdev_queue_io_done() at vdev_queue_io_done+0xcd/frame 0xfe066821fca0
zio_vdev_io_done() at zio_vdev_io_done+0xd9/frame 0xfe066821fcd0
zio_execute() at zio_execute+0x23d/frame 0xfe066821fd20
zio_vdev_io_start() at zio_vdev_io_start+0x34d/frame 0xfe066821fd80
zio_execute() at zio_execute+0x23d/frame 0xfe066821fdd0
vdev_queue_io_done() at 

Re: moving ezjail-based jails from 10.3 host to 11.0 host

2016-10-17 Thread Oliver Peter
On Mon, Oct 17, 2016 at 03:37:08PM +0200, Marko Cupać wrote:
> I have 10.3 host which runs a dozen or so ezjail-based jails. I have
> installed another 11.0 host, and I'd like to move jails to it.

I would switch them to iocage+zfs, ezjail is sooo 90s. :)
Have a look at the documentation:
http://iocage.readthedocs.io/en/latest/basic-use.html

All jail settings are stored within ZFS properties so an upcoming
migration would only need a zfs send | zfs receive.
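
For example, something along these lines (a sketch only - it assumes the pool
is called zroot, that iocage keeps its data under zroot/iocage, and that root
ssh to the new host works; your dataset layout may differ):

  zfs snapshot -r zroot/iocage@migrate
  zfs send -R zroot/iocage@migrate | ssh newhost zfs receive -u zroot/iocage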

> Can I just archive jails on 10.3, scp them to 11.0, and re-create them
> there by restoring from archive (-a switch)?

Further I would recommend to use rsync -av instead of scp.


-- 
Oliver PETER   oli...@gfuzz.de   0x456D688F
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Re: I'm upset about FreeBSD

2016-10-17 Thread Rostislav Krasny
On Mon, Oct 17, 2016 at 3:39 PM, Rostislav Krasny  wrote:
> On Mon, Oct 17, 2016 at 3:31 PM, krad  wrote:
>>
>> Does this just affect MBR layouts? If possible you might want to consider
>> UEFI booting for both windows and other os's, It's probably safer as you
>> dont need to plays with partitions and bootloaders.
>
> This is an old computer that doesn't support UEFI booting. In a past I
> tried older FreeBSD versions on it and I don't remember any boot issue
> with them. At least up to 9.X versions.

Ok. I dropped the FreeBSD ada0s2 slice and rewrote the MBR code with the
MbrFix util from www.sysint.no/mbrfix
Then I tried to install FreeBSD 11.0 again. Previously I used the
manual partitioning and now I used the guided UFS partitioning. This
time FreeBSD was installed properly without any boot issue. FreeBSD
was booting straight away. After that I ran "boot0cfg -B ada0" and the
boot0 boot manager is working properly again: both F1 for Windows and
F2 for FreeBSD.

It's much better than in my first try. There is probably a bug in
bsdinstall(8) when manual partitioning is used instead of the
guided one.
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


moving ezjail-based jails from 10.3 host to 11.0 host

2016-10-17 Thread Marko Cupać
Hi,

I have 10.3 host which runs a dozen or so ezjail-based jails. I have
installed another 11.0 host, and I'd like to move jails to it.

Can I just archive jails on 10.3, scp them to 11.0, and re-create them
there by restoring from archive (-a switch)?
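
Roughly what I have in mind, as a sketch (the jail name, IP and archive
filename are placeholders, and I haven't verified the exact ezjail-admin
syntax):

  # on the 10.3 host
  ezjail-admin archive myjail
  scp /usr/jails/ezjail_archives/myjail-*.tar.gz newhost:/tmp/
  # on the 11.0 host
  ezjail-admin create -a /tmp/myjail-archive.tar.gz myjail 192.0.2.10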

Are there any additional actions I should perform?

Thank you in advance,
-- 
Before enlightenment - chop wood, draw water.
After  enlightenment - chop wood, draw water.

Marko Cupać
https://www.mimar.rs/
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Build failed in Jenkins: FreeBSD_stable_10 #426

2016-10-17 Thread jenkins-admin
https://jenkins.FreeBSD.org/job/FreeBSD_stable_10/426/
[...truncated 105166 lines...]
--- PartialInlining.o ---
c++   -O2 -pipe 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmipo/../../../contrib/llvm/include
 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmipo/../../../contrib/llvm/tools/clang/include
 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmipo/../../../contrib/llvm/lib/Transforms/IPO
 -I. 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmipo/../../../contrib/llvm/../../lib/clang/include
 -DLLVM_ON_UNIX -DLLVM_ON_FREEBSD -D__STDC_LIMIT_MACROS 
-D__STDC_CONSTANT_MACROS -DNDEBUG -DCLANG_ENABLE_ARCMT -DCLANG_ENABLE_REWRITER 
-DCLANG_ENABLE_STATIC_ANALYZER -fno-strict-aliasing 
-DLLVM_DEFAULT_TARGET_TRIPLE=\"x86_64-unknown-freebsd10.3\" 
-DLLVM_HOST_TRIPLE=\"x86_64-unknown-freebsd10.3\" -DDEFAULT_SYSROOT=\"\" 
-Qunused-arguments -fstack-protector  -fno-exceptions -fno-rtti 
-Wno-c++11-extensions  -c 
/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmipo/../../../contrib/llvm/lib/Transforms/IPO/PartialInlining.cpp
 -o PartialInlining.o
--- all_subdir_libllvminstcombine ---
--- libllvminstcombine.a ---
building static llvminstcombine library
ranlib -D libllvminstcombine.a
--- all_subdir_libllvmlinker ---
===> lib/clang/libllvmlinker (all)
--- LinkModules.o ---
c++   -O2 -pipe 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmlinker/../../../contrib/llvm/include
 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmlinker/../../../contrib/llvm/tools/clang/include
 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmlinker/../../../contrib/llvm/lib/Linker
 -I. 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmlinker/../../../contrib/llvm/../../lib/clang/include
 -DLLVM_ON_UNIX -DLLVM_ON_FREEBSD -D__STDC_LIMIT_MACROS 
-D__STDC_CONSTANT_MACROS -DNDEBUG -DCLANG_ENABLE_ARCMT -DCLANG_ENABLE_REWRITER 
-DCLANG_ENABLE_STATIC_ANALYZER -fno-strict-aliasing 
-DLLVM_DEFAULT_TARGET_TRIPLE=\"x86_64-unknown-freebsd10.3\" 
-DLLVM_HOST_TRIPLE=\"x86_64-unknown-freebsd10.3\" -DDEFAULT_SYSROOT=\"\" 
-Qunused-arguments -fstack-protector  -fno-exceptions -fno-rtti 
-Wno-c++11-extensions  -c 
/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmlinker/../../../contrib/llvm/lib/Linker/LinkModules.cpp
 -o LinkModules.o
--- all_subdir_libllvmipo ---
--- PassManagerBuilder.o ---
c++   -O2 -pipe 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmipo/../../../contrib/llvm/include
 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmipo/../../../contrib/llvm/tools/clang/include
 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmipo/../../../contrib/llvm/lib/Transforms/IPO
 -I. 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmipo/../../../contrib/llvm/../../lib/clang/include
 -DLLVM_ON_UNIX -DLLVM_ON_FREEBSD -D__STDC_LIMIT_MACROS 
-D__STDC_CONSTANT_MACROS -DNDEBUG -DCLANG_ENABLE_ARCMT -DCLANG_ENABLE_REWRITER 
-DCLANG_ENABLE_STATIC_ANALYZER -fno-strict-aliasing 
-DLLVM_DEFAULT_TARGET_TRIPLE=\"x86_64-unknown-freebsd10.3\" 
-DLLVM_HOST_TRIPLE=\"x86_64-unknown-freebsd10.3\" -DDEFAULT_SYSROOT=\"\" 
-Qunused-arguments -fstack-protector  -fno-exceptions -fno-rtti 
-Wno-c++11-extensions  -c 
/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmipo/../../../contrib/llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
 -o PassManagerBuilder.o
--- StripDeadPrototypes.o ---
c++   -O2 -pipe 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmipo/../../../contrib/llvm/include
 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmipo/../../../contrib/llvm/tools/clang/include
 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmipo/../../../contrib/llvm/lib/Transforms/IPO
 -I. 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmipo/../../../contrib/llvm/../../lib/clang/include
 -DLLVM_ON_UNIX -DLLVM_ON_FREEBSD -D__STDC_LIMIT_MACROS 
-D__STDC_CONSTANT_MACROS -DNDEBUG -DCLANG_ENABLE_ARCMT -DCLANG_ENABLE_REWRITER 
-DCLANG_ENABLE_STATIC_ANALYZER -fno-strict-aliasing 
-DLLVM_DEFAULT_TARGET_TRIPLE=\"x86_64-unknown-freebsd10.3\" 
-DLLVM_HOST_TRIPLE=\"x86_64-unknown-freebsd10.3\" -DDEFAULT_SYSROOT=\"\" 
-Qunused-arguments -fstack-protector  -fno-exceptions -fno-rtti 
-Wno-c++11-extensions  -c 
/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmipo/../../../contrib/llvm/lib/Transforms/IPO/StripDeadPrototypes.cpp
 -o StripDeadPrototypes.o
--- all_subdir_libllvmcodegen ---
--- MachineVerifier.o ---
c++   -O2 -pipe 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmcodegen/../../../contrib/llvm/include
 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmcodegen/../../../contrib/llvm/tools/clang/include
 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmcodegen/../../../contrib/llvm/lib/CodeGen
 -I. 
-I/builds/workspace/FreeBSD_stable_10/src/lib/clang/libllvmcodegen/../../../contrib/llvm/../../lib/clang/include
 -DLLVM_ON_UNIX 

Poor ZFS ARC metadata hit/miss stats after recent ZFS updates

2016-10-17 Thread Fabian Keil
After rebasing some of my systems from r305866 to r307312
(plus local patches) I noticed that most of the ARC accesses
are counted as misses now.

Example:

[fk@elektrobier2 ~]$ uptime
 2:03PM  up 1 day, 18:36, 7 users, load averages: 0.29, 0.36, 0.30
[fk@elektrobier2 ~]$ zfs-stats -E


ZFS Subsystem ReportMon Oct 17 14:03:58 2016


ARC Efficiency: 3.38m
Cache Hit Ratio:12.87%  435.23k
Cache Miss Ratio:   87.13%  2.95m
Actual Hit Ratio:   9.55%   323.15k

Data Demand Efficiency: 6.61%   863.01k

CACHE HITS BY CACHE LIST:
  Most Recently Used:   18.97%  82.54k
  Most Frequently Used: 55.28%  240.60k
  Most Recently Used Ghost: 8.88%   38.63k
  Most Frequently Used Ghost:   24.84%  108.12k

CACHE HITS BY DATA TYPE:
  Demand Data:  13.10%  57.03k
  Prefetch Data:0.00%   0
  Demand Metadata:  32.94%  143.36k
  Prefetch Metadata:53.96%  234.85k

CACHE MISSES BY DATA TYPE:
  Demand Data:  27.35%  805.98k
  Prefetch Data:0.00%   0
  Demand Metadata:  71.21%  2.10m
  Prefetch Metadata:1.44%   42.48k



I suspect that this is caused by r307265 ("MFC r305323: MFV r302991:
6950 ARC should cache compressed data") which removed an
ARCSTAT_CONDSTAT() call, but I haven't confirmed this yet.

The system performance doesn't actually seem to be negatively affected
and repeated metadata accesses that are counted as misses are still served
from memory. On my freshly booted laptop I get:

fk@t520 /usr/ports $ for i in 1 2 3; do \
 /usr/local/etc/munin/plugins/zfs-absolute-arc-hits-and-misses; \
 time git status > /dev/null; \
 done; \
 /usr/local/etc/munin/plugins/zfs-absolute-arc-hits-and-misses;
zfs_arc_hits.value 5758
zfs_arc_misses.value 275416
zfs_arc_demand_metadata_hits.value 4331
zfs_arc_demand_metadata_misses.value 270252
zfs_arc_demand_data_hits.value 304
zfs_arc_demand_data_misses.value 3345
zfs_arc_prefetch_metadata_hits.value 1103
zfs_arc_prefetch_metadata_misses.value 1489
zfs_arc_prefetch_data_hits.value 20
zfs_arc_prefetch_data_misses.value 334

real1m23.398s
user0m0.974s
sys 0m12.273s
zfs_arc_hits.value 11346
zfs_arc_misses.value 389748
zfs_arc_demand_metadata_hits.value 7723
zfs_arc_demand_metadata_misses.value 381018
zfs_arc_demand_data_hits.value 400
zfs_arc_demand_data_misses.value 3412
zfs_arc_prefetch_metadata_hits.value 3202
zfs_arc_prefetch_metadata_misses.value 4885
zfs_arc_prefetch_data_hits.value 21
zfs_arc_prefetch_data_misses.value 437

real0m1.472s
user0m0.452s
sys 0m1.820s
zfs_arc_hits.value 11348
zfs_arc_misses.value 428536
zfs_arc_demand_metadata_hits.value 7723
zfs_arc_demand_metadata_misses.value 419782
zfs_arc_demand_data_hits.value 400
zfs_arc_demand_data_misses.value 3436
zfs_arc_prefetch_metadata_hits.value 3204
zfs_arc_prefetch_metadata_misses.value 4885
zfs_arc_prefetch_data_hits.value 21
zfs_arc_prefetch_data_misses.value 437

real0m1.537s
user0m0.461s
sys 0m1.860s
zfs_arc_hits.value 11352
zfs_arc_misses.value 467334
zfs_arc_demand_metadata_hits.value 7723
zfs_arc_demand_metadata_misses.value 458556
zfs_arc_demand_data_hits.value 400
zfs_arc_demand_data_misses.value 3460
zfs_arc_prefetch_metadata_hits.value 3208
zfs_arc_prefetch_metadata_misses.value 4885
zfs_arc_prefetch_data_hits.value 21
zfs_arc_prefetch_data_misses.value 437

Disabling ARC compression through vfs.zfs.compressed_arc_enabled
does not affect the accounting issue.
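
For anyone who wants to compare numbers, the raw counters can also be read
directly (a sketch, assuming the stock arcstats kstat names, which is
presumably what the munin plugin output above is derived from):

  sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses \
         kstat.zfs.misc.arcstats.demand_metadata_hits \
         kstat.zfs.misc.arcstats.demand_metadata_misses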

Can anybody reproduce this?

Fabian




Re: I'm upset about FreeBSD

2016-10-17 Thread Rostislav Krasny
On Mon, Oct 17, 2016 at 3:31 PM, krad  wrote:
>
> Does this just affect MBR layouts? If possible you might want to consider
> UEFI booting for both windows and other os's, It's probably safer as you
> dont need to plays with partitions and bootloaders.

This is an old computer that doesn't support UEFI booting. In the past I
tried older FreeBSD versions on it and I don't remember any boot issues
with them. At least up to the 9.X versions.
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: I'm upset about FreeBSD

2016-10-17 Thread krad
Does this just affect MBR layouts? If possible you might want to consider
UEFI booting for both Windows and other OSes. It's probably safer as you
don't need to play with partitions and bootloaders.

On 17 October 2016 at 13:11, Rostislav Krasny  wrote:

> On 17.10.2016 11:57:16 +0500, Eugene M. Zheganin wrote:
> > Hi.
> >
> > On 17.10.2016 5:44, Rostislav Krasny wrote:
> > > Hi,
> > >
> > > I've been using FreeBSD for many years. Not as my main operating
> > > > system, though. But anyway several bugs and patches were contributed
> > > and somebody even added my name into the additional contributors list.
> > > That's pleasing but today I tried to install the FreeBSD 11.0 and I'm
> > > > upset about this operating system.
> > >
> > > First of all I faced an old problem that I reported here a year ago:
> > > http://comments.gmane.org/gmane.os.freebsd.stable/96598
> > > > Completely new USB flash drive flashed by the
> > > FreeBSD-11.0-RELEASE-i386-mini-memstick.img file kills every Windows
> > > again. If I use the Rufus util to write the img file (using DD mode)
> > > the Windows dies immediately after the flashing. If I use the
> > > Win32DiskImager (suggested by the Handbook) it doesn't reinitialize
> > > the USB storage and Windows dies only if I remove and put that USB
> > > flash drive again or boot Windows when it is connected. Nothing was
> > > done to fix this nasty bug for a year.
> >
> > I saw this particular bug, and I must say - man, it's not FreeBSD, it's
> Rufus.
> > So far Windows doesn't have any decent tool to write the image with. As
> > about Rufus - somehow it does produce broken images on a USB stick
> > (not always though), which make every Windows installation to BSOD
> > immediately after inserting. And this continues until you reinitialize
> the
> > stick boot area.
>
> The DD mode in Rufus and in any other flashing util that supports the
> DD mode (including the native dd(1) program) just writes the image
> file byte by byte without any change, isn't it? If you say the boot
> area re-initialization resolves the BSOD issue then why the boot area
> isn't fixed in the image file at the first place? I agree that Windows
> has a serious bug but why FreeBSD doesn't try to workaround it? After
> all people try to install FreeBSD and if the Windows bug and the
> FreeBSD developers stubbornness prevent them to do so they just can't
> and won't install FreeBSD. This is a lose-lose situation.
>
> > P.S. By the way, win32diskimager is a total mess too. Sometimes it just
> > does nothing instead of writing an image. I did try almost all of the
> free
> >  win32 tools to write image with, and didn't find any that would
> completely
> > satisfy me. Rufus would be the best, if it didn't have this ridiculous
> bug with
> > BSOD.
>
> Did you try the native FreeBSD dd(1) program or a MinGW version of dd(1)?
>
> And what about other issues I've described in my first email? I
> managed to install FreeBSD 11.0 but it still unbootable.
>
>
> P.S. please Cc your replies to me since I don't receive this mailing
> list emails directly, although I'm subscribed.
> ___
> freebsd-stable@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: I'm upset about FreeBSD

2016-10-17 Thread Rostislav Krasny
On 17.10.2016 11:57:16 +0500, Eugene M. Zheganin wrote:
> Hi.
>
> On 17.10.2016 5:44, Rostislav Krasny wrote:
> > Hi,
> >
> > I've been using FreeBSD for many years. Not as my main operating
> > > system, though. But anyway several bugs and patches were contributed
> > and somebody even added my name into the additional contributors list.
> > That's pleasing but today I tried to install the FreeBSD 11.0 and I'm
> > > upset about this operating system.
> >
> > First of all I faced an old problem that I reported here a year ago:
> > http://comments.gmane.org/gmane.os.freebsd.stable/96598
> > > Completely new USB flash drive flashed by the
> > FreeBSD-11.0-RELEASE-i386-mini-memstick.img file kills every Windows
> > again. If I use the Rufus util to write the img file (using DD mode)
> > the Windows dies immediately after the flashing. If I use the
> > Win32DiskImager (suggested by the Handbook) it doesn't reinitialize
> > the USB storage and Windows dies only if I remove and put that USB
> > flash drive again or boot Windows when it is connected. Nothing was
> > done to fix this nasty bug for a year.
>
> I saw this particular bug, and I must say - man, it's not FreeBSD, it's Rufus.
> So far Windows doesn't have any decent tool to write the image with. As
> about Rufus - somehow it does produce broken images on a USB stick
> (not always though), which make every Windows installation to BSOD
> immediately after inserting. And this continues until you reinitialize the
> stick boot area.

The DD mode in Rufus, and in any other flashing util that supports the
DD mode (including the native dd(1) program), just writes the image
file byte by byte without any change, doesn't it? If you say that
re-initializing the boot area resolves the BSOD issue, then why isn't
the boot area fixed in the image file in the first place? I agree that
Windows has a serious bug, but why doesn't FreeBSD try to work around
it? After all, people try to install FreeBSD, and if the Windows bug and
the FreeBSD developers' stubbornness prevent them from doing so, they
just can't and won't install FreeBSD. This is a lose-lose situation.
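
For reference, the equivalent native invocation is roughly this (a sketch -
the da0 device name is an assumption, so check which device node the stick
attaches as before running it, since dd overwrites whatever it is pointed at):

  dd if=FreeBSD-11.0-RELEASE-i386-mini-memstick.img of=/dev/da0 bs=1m conv=sync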

> P.S. By the way, win32diskimager is a total mess too. Sometimes it just
> does nothing instead of writing an image. I did try almost all of the free
>  win32 tools to write image with, and didn't find any that would completely
> satisfy me. Rufus would be the best, if it didn't have this ridiculous bug 
> with
> BSOD.

Did you try the native FreeBSD dd(1) program or a MinGW version of dd(1)?

And what about the other issues I've described in my first email? I
managed to install FreeBSD 11.0 but it is still unbootable.


P.S. please Cc your replies to me since I don't receive this mailing
list emails directly, although I'm subscribed.
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: I'm upset about FreeBSD

2016-10-17 Thread Borja Marcos

> On 17 Oct 2016, at 02:44, Rostislav Krasny  wrote:
> 
> Hi,
> 
> First of all I faced an old problem that I reported here a year ago:
> http://comments.gmane.org/gmane.os.freebsd.stable/96598
> Completely new USB flash drive flashed by the
> FreeBSD-11.0-RELEASE-i386-mini-memstick.img file kills every Windows
> again. If I use the Rufus util to write the img file (using DD mode)
> the Windows dies immediately after the flashing. If I use the
> Win32DiskImager (suggested by the Handbook) it doesn't reinitialize
> the USB storage and Windows dies only if I remove and put that USB
> flash drive again or boot Windows when it is connected. Nothing was
> done to fix this nasty bug for a year.

I'm afraid that's a Windows problem. And potentially a critical one. That
barfing upon USB insertion might point to a buffer overflow condition.

Now that people from Microsoft are reading these lists and polishing support for
the Microsoft hypervisor, maybe they should slap some wrists in-house (hard!).
*nudge*-*nudge*-*wink*-*wink*.





Borja.

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Re: I'm upset about FreeBSD

2016-10-17 Thread Eugene M. Zheganin

Hi.

On 17.10.2016 5:44, Rostislav Krasny wrote:

Hi,

I've been using FreeBSD for many years. Not as my main operating
system, though. But anyway several bugs and patches were contributed
and somebody even added my name into the additional contributors list.
That's pleasing but today I tried to install the FreeBSD 11.0 and I'm
upset about this operating system.

First of all I faced an old problem that I reported here a year ago:
http://comments.gmane.org/gmane.os.freebsd.stable/96598
Completely new USB flash drive flashed by the
FreeBSD-11.0-RELEASE-i386-mini-memstick.img file kills every Windows
again. If I use the Rufus util to write the img file (using DD mode)
the Windows dies immediately after the flashing. If I use the
Win32DiskImager (suggested by the Handbook) it doesn't reinitialize
the USB storage and Windows dies only if I remove and put that USB
flash drive again or boot Windows when it is connected. Nothing was
done to fix this nasty bug for a year.


I saw this particular bug, and I must say - man, it's not FreeBSD, it's Rufus. 
So far Windows doesn't have any decent tool to write the image with. As for 
Rufus - somehow it does produce broken images on a USB stick (not always 
though), which make every Windows installation BSOD immediately after 
insertion. And this continues until you reinitialize the stick's boot area. My 
opinion on this hasn't changed regardless of the operating system: if 
something traps after something valid happens (like a USB flash drive being 
inserted) - that's the OS's problem, not the problem of whoever triggered it. 
Especially in the case when the inserted USB flash drive contains no FS that 
Windows can recognize and mount out-of-the-box. A non-buggy OS just shouldn't 
trap on whatever is inserted into its USB port, because it feels like the 
cheap movie The Net with Sandra Bullock.

FreeBSD has many problems (and I'm upset with it too), but this just isn't one 
of them. Such a thing simply never happens when you create the image using dd 
on any OS that has it natively. So it's a bad experience with Rufus, not with 
FreeBSD.

P.S. By the way, win32diskimager is a total mess too. Sometimes it just does 
nothing instead of writing an image. I did try almost all of the free win32 
tools to write image with, and didn't find any that would completely satisfy 
me. Rufus would be the best, if it didn't have this ridiculous bug with BSOD.

Eugene.

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"