Re: vfs.zfs.min_auto_ashift and OpenZFS

2020-05-01 Thread Steven Hartland

Looks like it should still be there if you're using the in-tree ZFS:
https://svnweb.freebsd.org/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c?revision=358333=markup#l144

On 01/05/2020 19:03, Graham Perrin wrote:

In my sysctl.conf:

vfs.zfs.min_auto_ashift=12

– if I recall correctly, the line was written automatically when I 
installed FreeBSD-CURRENT a year or so ago.


With OpenZFS enabled:

root@momh167-gjp4-8570p:~ # sysctl vfs.zfs.min_auto_ashift
sysctl: unknown oid 'vfs.zfs.min_auto_ashift'

Should I have a different line in sysctl.conf, or is 
vfs.zfs.min_auto_ashift not required with OpenZFS?


Hardware: HP EliteBook 8570p, circa 2013
 


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to 
"freebsd-current-unsubscr...@freebsd.org"




Re: The future of ZFS in FreeBSD

2018-12-20 Thread Steven Hartland



On 20/12/2018 11:03, Bob Bishop wrote:

Hi,


On 19 Dec 2018, at 23:16, Matthew Macy  wrote:

On Wed, Dec 19, 2018 at 15:11 Steven Hartland 
wrote:


Sorry, I've been off for a few weeks so must have missed that; please do prod
me again if you don’t see a response to anything, not just this. Like many
others I get so many emails across so many lists that it’s more than likely I
just missed it.

That said, would you say that with the right support we can make progress
on this prior to the port? I have to ask, as the alternative version has
been on the cusp for many years now, so it feels more like a distant
memory than something that may happen. No disrespect to anyone involved, as
I know all too well how hard it can be to get something like this over the
line, especially when people have competing priorities.


I am hoping that it's sufficiently important to FreeBSD ZFS developers that
they'll give the PR the attention it needs so that it can be merged before
summer. My understanding is that it's mostly suffered from neglect. TRIM is
most important to FreeBSD and it already had its own implementation.

https://github.com/zfsonlinux/zfs/pull/5925

Please correct me if I’m wrong but this looks a lot less mature than FreeBSD’s 
existing TRIM support for ZFS which we’ve had in production for six years.

What is the rationale here? I’m concerned that it looks like an opportunity for 
mighty regressions.

This is the case, but overall this solution is thought to be a better 
approach.


With anything like this there is always a risk, so we all need a 
concerted effort to get to one solution.


    Regards
    Steve


Re: The future of ZFS in FreeBSD

2018-12-19 Thread Steven Hartland
Sorry, I've been off for a few weeks so must have missed that; please do prod
me again if you don’t see a response to anything, not just this. Like many
others I get so many emails across so many lists that it’s more than likely I
just missed it.

That said, would you say that with the right support we can make progress on
this prior to the port? I have to ask, as the alternative version has
been on the cusp for many years now, so it feels more like a distant
memory than something that may happen. No disrespect to anyone involved, as
I know all too well how hard it can be to get something like this over the
line, especially when people have competing priorities.


On Wed, 19 Dec 2018 at 22:52, Matthew Macy  wrote:

>
>
> On Wed, Dec 19, 2018 at 14:47 Steven Hartland 
> wrote:
>
>> Thanks for the write-up, most appreciated. One of the more meaty
>> differences is that FreeBSD ZFS still has the only merged and production
>> ready TRIM support, so my question would be: are there any plans to address
>> this before creating the new port? Going back to a world without TRIM
>> support wouldn’t be something I’d look forward to.
>>
>
> Well, then please follow up on the request I CC'd you on a week ago asking
> that you engage on the deadlist based TRIM  PR. That's a better forum for
> discussing these details than lamenting in public lists.
>
> Thanks.
>
> -M
>
>
>
>
>> On Wed, 19 Dec 2018 at 06:51, Matthew Macy  wrote:
>>
>>> The sources for FreeBSD's ZFS support are currently taken directly
>>> from Illumos with local ifdefs to support the peculiarities of FreeBSD
>>> where the Solaris Portability Layer (SPL) shims fall short. FreeBSD
>>> has regularly pulled changes from Illumos and tried to push back any
>>> bug fixes and new features done in the context of FreeBSD. In the past
>>> few years the vast majority of new development in ZFS has taken place
>>> in DelphixOS and zfsonlinux (ZoL). Earlier this year Delphix announced
>>> that they will be moving to ZoL
>>> https://www.delphix.com/blog/kickoff-future-eko-2018 This shift means
>>> that there will be little to no net new development of Illumos. While
>>> working through the git history of ZoL I have also discovered that
>>> many races and locking bugs have been fixed in ZoL and never made it
>>> back to Illumos and thus FreeBSD. This state of affairs has led to a
>>> general agreement among the stakeholders that I have spoken to that it
>>> makes sense to rebase FreeBSD's ZFS on ZoL. Brian Behlendorf
>>> has graciously encouraged me to add FreeBSD support directly to ZoL
>>> https://github.com/zfsonfreebsd/ZoF so that we might all have a single
>>> shared code base.
>>>
>>> A port for ZoF can be found at https://github.com/miwi-fbsd/zof-port
>>> Before it can be committed some additional functionality needs to be
>>> added to the FreeBSD opencrypto framework. These can be found at
>>> https://reviews.freebsd.org/D18520
>>>
>>> This port will provide FreeBSD users with multi modifier protection,
>>> project quotas, encrypted datasets, allocation classes, vectorized
>>> raidz, vectorized checksums, and various command line improvements.
>>>
>>> Before ZoF can be merged back in to ZoL several steps need to be taken:
>>> - Integrate FreeBSD support into ZoL CI
>>> - Have most of the ZFS test suite passing
>>> - Complete additional QA testing at iX
>>>
>>> We at iX Systems need to port ZoL's EC2 CI scripts to work with
>>> FreeBSD and make sure that most of the ZFS Test Suite (ZTS) passes.
>>> Being integrated in to their CI will mean that, among other things,
>>> most integration issues will be caught before a PR is merged upstream
>>> rather than many months later when it is MFVed into FreeBSD. I’m
>>> hoping to submit the PR to ZoL some time in January.
>>>
>>> This port will make it easy for end users on a range of releases to
>>> run the latest version of ZFS. Nonetheless, transitioning away from an
>>> Illumos based ZFS is not likely to be entirely seamless. The
>>> stakeholders I’ve spoken to all agree that this is the best path
>>> forward but some degree of effort needs to be made to accommodate
>>> downstream consumers. The current plan is to import ZoF and unhook the
>>> older Illumos based sources from the build on April 15th or two months
>>> after iX systems QA deems ZoF stable - which ever comes later. The
>>> Illumos based sources will be removed some time later - but well
>>> before 13. This will 

Re: The future of ZFS in FreeBSD

2018-12-19 Thread Steven Hartland
Thanks for the write-up, most appreciated. One of the more meaty differences
is that FreeBSD ZFS still has the only merged and production ready TRIM
support, so my question would be: are there any plans to address this before
creating the new port? Going back to a world without TRIM support
wouldn’t be something I’d look forward to.

On Wed, 19 Dec 2018 at 06:51, Matthew Macy  wrote:

> The sources for FreeBSD's ZFS support are currently taken directly
> from Illumos with local ifdefs to support the peculiarities of FreeBSD
> where the Solaris Portability Layer (SPL) shims fall short. FreeBSD
> has regularly pulled changes from Illumos and tried to push back any
> bug fixes and new features done in the context of FreeBSD. In the past
> few years the vast majority of new development in ZFS has taken place
> in DelphixOS and zfsonlinux (ZoL). Earlier this year Delphix announced
> that they will be moving to ZoL
> https://www.delphix.com/blog/kickoff-future-eko-2018 This shift means
> that there will be little to no net new development of Illumos. While
> working through the git history of ZoL I have also discovered that
> many races and locking bugs have been fixed in ZoL and never made it
> back to Illumos and thus FreeBSD. This state of affairs has led to a
> general agreement among the stakeholders that I have spoken to that it
> makes sense to rebase FreeBSD's ZFS on ZoL. Brian Behlendorf
> has graciously encouraged me to add FreeBSD support directly to ZoL
> https://github.com/zfsonfreebsd/ZoF so that we might all have a single
> shared code base.
>
> A port for ZoF can be found at https://github.com/miwi-fbsd/zof-port
> Before it can be committed some additional functionality needs to be
> added to the FreeBSD opencrypto framework. These can be found at
> https://reviews.freebsd.org/D18520
>
> This port will provide FreeBSD users with multi modifier protection,
> project quotas, encrypted datasets, allocation classes, vectorized
> raidz, vectorized checksums, and various command line improvements.
>
> Before ZoF can be merged back in to ZoL several steps need to be taken:
> - Integrate FreeBSD support into ZoL CI
> - Have most of the ZFS test suite passing
> - Complete additional QA testing at iX
>
> We at iX Systems need to port ZoL's EC2 CI scripts to work with
> FreeBSD and make sure that most of the ZFS Test Suite (ZTS) passes.
> Being integrated in to their CI will mean that, among other things,
> most integration issues will be caught before a PR is merged upstream
> rather than many months later when it is MFVed into FreeBSD. I’m
> hoping to submit the PR to ZoL some time in January.
>
> This port will make it easy for end users on a range of releases to
> run the latest version of ZFS. Nonetheless, transitioning away from an
> Illumos based ZFS is not likely to be entirely seamless. The
> stakeholders I’ve spoken to all agree that this is the best path
> forward but some degree of effort needs to be made to accommodate
> downstream consumers. The current plan is to import ZoF and unhook the
> older Illumos based sources from the build on April 15th or two months
> after iX systems QA deems ZoF stable - which ever comes later. The
> Illumos based sources will be removed some time later - but well
> before 13. This will give users a 3 month period during which both the
> port and legacy Illumos based ZFS will be available to users. Pools
> should interoperate between ZoF and legacy provided the user does not
> enable any features available only in ZoF. We will try to accommodate
> any downstream consumers in the event that they need that date pushed
> back. We ask that any downstream consumers who are particularly
> sensitive to changes start testing the port when it is formally
> announced and report back any issues they have. I will do my best to
> ensure that this message is communicated to all those who it may
> concern. However, I can’t ensure that everyone reads these lists. That
> is the responsibility of -CURRENT users.
>
> -M
> ___
> freebsd...@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscr...@freebsd.org"
>


Re: ZFS: alignment/boundary for partition type freebsd-zfs

2017-12-26 Thread Steven Hartland
Yes, it does know how to figure that out based on the stripe size.
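
As a rough illustration of the idea only (a simplified model, not the actual
FreeBSD vdev code): the disk reports a logical sector size and a stripe size,
the larger of the two determines the ashift, and vfs.zfs.min_auto_ashift acts
as a lower bound. The function and parameter names below are made up for this
sketch.

```python
def ashift_for(sector_size: int, stripe_size: int = 0,
               min_auto_ashift: int = 9) -> int:
    """Simplified model of choosing an ashift (log2 of the allocation size).

    sector_size     -- logical sector size reported by the disk, in bytes
    stripe_size     -- reported stripe (physical) size in bytes, 0 if unknown
    min_auto_ashift -- stand-in for the vfs.zfs.min_auto_ashift sysctl
    """
    # Use the larger of the logical sector size and the stripe size.
    effective = max(sector_size, stripe_size)
    # The effective size must be a positive power of two.
    if effective <= 0 or effective & (effective - 1):
        raise ValueError("effective size must be a positive power of two")
    ashift = effective.bit_length() - 1  # log2 of a power of two
    # The sysctl can only raise the result, never lower it.
    return max(ashift, min_auto_ashift)


print(ashift_for(512, 4096))    # 512b emulation, 4k stripe size -> 12
print(ashift_for(512))          # plain 512-byte drive -> 9
print(ashift_for(512, 0, 12))   # drive hides its size, sysctl set to 12 -> 12
```

In this model the sysctl only matters in the third case, where the drive does
not report a stripe size larger than its logical sector size.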

On Tue, 26 Dec 2017 at 20:25, O. Hartmann  wrote:

> Am Tue, 26 Dec 2017 09:31:53 -0800 (PST)
> "Rodney W. Grimes"  schrieb:
>
> > > On Tue, Dec 26, 2017 at 10:04 AM, O. Hartmann 
> > > wrote:
> > >
> > > > Am Tue, 26 Dec 2017 11:44:29 -0500
> > > > Allan Jude  schrieb:
> > > >
> > > > > On 2017-12-26 11:24, O. Hartmann wrote:
> > > > > > > Running recent CURRENT on most of our lab's boxes, I needed to
> > > > > > > replace and restore a ZFS RAIDZ pool. To do so, I had to
> > > > > > > partition the disks I was about to replace. The drives in
> > > > > > > question are 4k block size drives with 512b emulation, as most
> > > > > > > of them are today. I've created the one and only partition on
> > > > > > > each 4 TB drive via the command sequence
> > > > > >
> > > > > > gpart create -s GPT adaX
> > > > > > gpart add -t freebsd-zfs -a 4k -l nameXX adaX
> > > > > >
> > > > > > > After doing this on all drives I was about to replace, something
> > > > > > > drove me to check on the net, and I found a lot of websites
> > > > > > > giving "advice" on how to prepare large, modern drives for ZFS.
> > > > > > > I think the GNOP trick is not necessary any more, but many blogs
> > > > > > > recommend running
> > > > > >
> > > > > > gpart add -t freebsd-zfs -b 1m -a 4k -l nameXX adaX
> > > > > >
> > > > > > > to put the partition start at the 1 MB boundary. I didn't do
> > > > > > > that. My partitions all start now at block 40.
> > > > > > >
> > > > > > > My question is: will this have severe performance consequences
> > > > > > > or is that negligible?
> > > > > > >
> > > > > > > Since most of those websites I found via "zfs freebsd alignement"
> > > > > > > are from years ago, I'm a bit confused now, and the prospect of
> > > > > > > performing this whole days-long resilvering process is costing
> > > > > > > me more hair than the usual "fallout" ...
> > > > > >
> > > > > > Thanks in advance,
> > > > > >
> > > > > > Oliver
> > > > > >
> > > > >
> > > > > > The 1 MB alignment is not required. It is just what I do to leave
> > > > > > room for the other partition types before the ZFS partition.
> > > > > >
> > > > > > However, the replacement for the GNOP hack is separate. In
> > > > > > addition to aligning the partitions to 4k, you have to tell ZFS
> > > > > > that the drive is 4k:
> > > > >
> > > > > sysctl vfs.zfs.min_auto_ashift=12
> > > > >
> > > > > (2^12 = 4096)
> > > > >
> > > > > Before you create the pool, or add additional vdevs.
> > > > >
> > > >
> > > > > I didn't do the sysctl vfs.zfs.min_auto_ashift=12 :-(( when I
> > > > > created the vdev. What is the consequence of that for the pool? I
> > > > > was under the impression that this is necessary for "native 4k"
> > > > > drives.
> > > > >
> > > > > How can I check what ashift is in effect for a specific vdev?
> > > >
> > >
> > > > It's only necessary if your drive stupidly fails to report its physical
> > > > sector size correctly, and no other FreeBSD developer has already
> > > > written a quirk for that drive.  Do "zdb -l /dev/adaXXXpY" for any one
> > > > of the partitions in the ZFS raid group in question.  It should print
> > > > either "ashift: 12" or "ashift: 9".
> >
> > And more than likely if you used the bsdinstall from one of
> > the distributions to setup the system you created the ZFS
> > pool from it has the sysctl in /boot/loader.conf as the
> > default for all? recent?  bsdinstall's is that the 4k default
> > is used and the sysctl gets written to /boot/loader.conf
> > at install time so from then on all pools you create shall
> > also be 4k.   You have to change a default during the
> > system install to change this to 512.
>
>
> I never used any installation scripts so far.
>
> Before I replaced the pool's drives, I tried to search for information on
> how to do it. This important tiny fact must have slipped through - or it is
> very badly documented. I didn't find a hint in tuning(7), which is the man
> page I consulted first.
>
> Luckily, as Allan Jude stated, the disk recognition was correct (I guess
> stripesize instead of blocksize is taken?).
>
> >
> > > -aLAn
> >
> > >
> >
>
>
>
> --
> O. Hartmann
>
> I object to the use or transfer of my data for advertising purposes or
> for market or opinion research (§ 28 Abs. 4 BDSG).
>


Re: ZFS: alignment/boundary for partition type freebsd-zfs

2017-12-26 Thread Steven Hartland
You only need to set the min if the drives hide their true sector size, as
Allan mentioned.
camcontrol identify is one of the easiest ways to check this.

If the pool reports ashift 12, then ZFS correctly detected the drives as 4K,
so that part is good.
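
For the alignment half of the question, the arithmetic can be checked with a
small sketch. It assumes the "block 40" quoted below is counted in 512-byte
logical sectors (the drives are described as 4k with 512b emulation); the
helper names are made up for the example.

```python
def ashift_to_bytes(ashift: int) -> int:
    """ashift is the log2 of the allocation size: 12 -> 4096, 9 -> 512."""
    return 1 << ashift


def is_aligned(start_block: int, logical_sector: int, ashift: int) -> bool:
    """True if a partition starting at start_block (counted in logical
    sectors of logical_sector bytes) begins on an ashift-sized boundary."""
    start_bytes = start_block * logical_sector
    return start_bytes % ashift_to_bytes(ashift) == 0


print(ashift_to_bytes(12))         # 4096
print(is_aligned(40, 512, 12))     # True: 40 * 512 = 20480 = 5 * 4096
print(is_aligned(2048, 512, 12))   # True: a 1 MB start is also 4k-aligned
print(is_aligned(34, 512, 12))     # False: 34 * 512 = 17408 is not
```

Under that assumption a start at block 40 is already 4k-aligned, which is
consistent with the advice that the 1 MB start is optional.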

On Tue, 26 Dec 2017 at 20:15, Rodney W. Grimes <
freebsd-...@pdx.rh.cn85.dnsmgr.net> wrote:

> > On Tue, Dec 26, 2017 at 10:04 AM, O. Hartmann 
> > wrote:
> >
> > > Am Tue, 26 Dec 2017 11:44:29 -0500
> > > Allan Jude  schrieb:
> > >
> > > > On 2017-12-26 11:24, O. Hartmann wrote:
> > > > > Running recent CURRENT on most of our lab's boxes, I needed to
> > > > > replace and restore a ZFS RAIDZ pool. To do so, I had to partition
> > > > > the disks I was about to replace. The drives in question are 4k
> > > > > block size drives with 512b emulation, as most of them are today.
> > > > > I've created the one and only partition on each 4 TB drive via the
> > > > > command sequence
> > > > >
> > > > > gpart create -s GPT adaX
> > > > > gpart add -t freebsd-zfs -a 4k -l nameXX adaX
> > > > >
> > > > > After doing this on all drives I was about to replace, something
> > > > > drove me to check on the net, and I found a lot of websites giving
> > > > > "advice" on how to prepare large, modern drives for ZFS. I think
> > > > > the GNOP trick is not necessary any more, but many blogs recommend
> > > > > running
> > > > >
> > > > > gpart add -t freebsd-zfs -b 1m -a 4k -l nameXX adaX
> > > > >
> > > > > to put the partition start at the 1 MB boundary. I didn't do
> > > > > that. My partitions all start now at block 40.
> > > > >
> > > > > My question is: will this have severe performance consequences or
> > > > > is that negligible?
> > > > >
> > > > > Since most of those websites I found via "zfs freebsd alignement"
> > > > > are from years ago, I'm a bit confused now, and the prospect of
> > > > > performing this whole days-long resilvering process is costing me
> > > > > more hair than the usual "fallout" ...
> > > > >
> > > > > Thanks in advance,
> > > > >
> > > > > Oliver
> > > > >
> > > >
> > > > The 1 MB alignment is not required. It is just what I do to leave room
> > > > for the other partition types before the ZFS partition.
> > > >
> > > > However, the replacement for the GNOP hack is separate. In addition
> > > > to aligning the partitions to 4k, you have to tell ZFS that the drive
> > > > is 4k:
> > > >
> > > > sysctl vfs.zfs.min_auto_ashift=12
> > > >
> > > > (2^12 = 4096)
> > > >
> > > > Before you create the pool, or add additional vdevs.
> > > >
> > >
> > > I didn't do the sysctl vfs.zfs.min_auto_ashift=12 :-(( when I created
> > > the vdev. What is the consequence of that for the pool? I was under
> > > the impression that this is necessary for "native 4k" drives.
> > >
> > > How can I check what ashift is in effect for a specific vdev?
> > >
> >
> > It's only necessary if your drive stupidly fails to report its physical
> > sector size correctly, and no other FreeBSD developer has already
> > written a quirk for that drive.  Do "zdb -l /dev/adaXXXpY" for any one
> > of the partitions in the ZFS raid group in question.  It should print
> > either "ashift: 12" or "ashift: 9".
>
> And more than likely if you used the bsdinstall from one of
> the distributions to setup the system you created the ZFS
> pool from it has the sysctl in /boot/loader.conf as the
> default for all? recent?  bsdinstall's is that the 4k default
> is used and the sysctl gets written to /boot/loader.conf
> at install time so from then on all pools you create shall
> also be 4k.   You have to change a default during the
> system install to change this to 512.
>
> > -aLAn
>
> >
>
> --
> Rod Grimes
> rgri...@freebsd.org
>


Re: r319971 -> r320351: Fatal error 'Cannot allocate red zone for initial thread'

2017-06-26 Thread Steven Hartland

Is this related to kib's additional fix over the weekend?

https://svnweb.freebsd.org/changeset/base/320344

Regards
Steve

On 26/06/2017 09:29, O. Hartmann wrote:

Over the past week we did not update several 12-CURRENT running development
hosts, so today is the first day of performing this task.

First I hit the very same problem David Wolfskill reported earlier, a fatal
trap 12, but following the thread, I did as advised: removing /usr/obj
completely (we use filemon/WITH_META_MODE=YES all over the place) and
recompiling world and kernel.

Since tag 20170617 in /usr/src/UPDATING referred to the INO64 update and the
INO64 update hasn't been performed so far starting from r319971, I installed
the kernel, rebooted the box in single user mode (this time smoothly), did a
mergemaster and tried to do "make installworld" - but the box instantaneously
bails out:

[...]
Fatal error 'Cannot allocate red zone for initial thread' at line 392 in
file /usr/src/lib/libthread/thr_init.c
pid 60 (cc) uid0: exited on signal 6 ...

[...]

That way, I obviously can not install a world :-(

What is wrong here? Is the problem resolvable?

Kind regards,

Oliver


Re: Are textmode consoles/terminals still supported?

2017-03-20 Thread Steven Hartland

It's a tunable, not a sysctl, so you can't query it; you just need to set it
by adding the following to /boot/loader.conf:

hw.vga.textmode="1"

On 20/03/2017 21:58, Chris H wrote:

I'm attempting to get a video card that DTRT on FreeBSD.
I started with the graphics provided by an AMD A6-7470K,
only to discover it's not yet supported. So I forked out
for a recent nvidia card, and build/installed a new
world/kernel.
Everything seemed to be as one would expect, except there
was an issue with loader.efi. So I had to move mine aside,
and use the one off the install media (tho I understand
the (u)efi has since been fixed). Now, I'm attempting to
obtain textmode. The text stripped from a tty, and pasted
to a new file in a textmode editor -- ee(1) for example;
pads the line with spaces to EOL, and prefaces each line
following the first line with rubbish (about 1 to 2
characters worth).
So "graphics mode" or vt(4) isn't going to get it, for me.
Textmode, and syscons(4) has always worked as expected, and
I thought I'd try to re-enable it, or get textmode via vt(4).
But all attempts fail.
excerpt from my KERNCONF

device  vga
options VESA

device  sc
options SC_PIXEL_MODE

device  vt
device  vt_vga
device  vt_efifb

However, following the advice on the freebsd wiki, querying
the value in sysctl(8) returns:
# sysctl hw.vga.textmode
sysctl: unknown oid 'hw.vga.textmode'

OK how bout vidcontrol(1)
# vidcontrol -i adapter
vidcontrol: obtaining adapter information: Inappropriate ioctl for device

So, it appears from my standpoint that textmode is no longer
supported?

FWIW:
FreeBSD trump.whitehouse.gov.test 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r314700:
Sun Mar 5 09:01:30 PST 2017
r...@trump.whitehouse.gov.test:/usr/obj/usr/src/sys/TESTKERN amd6

Thank you for anything that might help me obtain textmode again.

--Chris




Re: gptzfsboot trouble

2017-02-07 Thread Steven Hartland


On 07/02/2017 16:33, Toomas Soome wrote:



On 7. veebr 2017, at 18:08, Thomas Sparrevohn 
 wrote:

Hi all



Last week I decided to upgrade my FreeBSD installation - it's been a while
(September 16 was last time). Unfortunately CURRENT does not boot and crashes
in a weird way. Both the 11-RELEASE and 12 CURRENT boot loaders seem to
attempt to read blocks that exceed the physical disk. Initially I thought it
was a hard disk error - but after an "oh" experience I realised that the
"gptzfsboot: error 4 lba 921592" is actually beyond the physical boundaries
of the disk (300GB disk). In order to rule out different options - I
installed a vanilla 11-RELEASE on the 300G with a simple stripe - it also
gives the error but does boot - the LBA of the error is slightly different
on 11 CURRENT and comes up with LBA 921600



I have scanned all the disks for physical faults and there seem to be none,
and I have tried doing a single disk installation on each disk - they give
the same error. Does anybody have any idea? I've included photos, as sometimes
it gets through to the actual boot menu but then crashes in another place.



gptzfsboot does read the backup label from the disk, and the GPT backup
label is stored at the end of the disk. The location of the backup label is in
the primary GPT table, in the alternate sector field. I wonder if that location
is somehow set to a bad value…


Booting from a live CD and inspecting the label with gpart show should 
be able to confirm that.
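
As a sketch of what "a bad value" would mean here: in a GPT layout the backup
header normally lives in the last sector of the disk, so the alternate sector
field in the primary header should equal the sector count minus one. The
function names and the nominal 300 GB figure below are assumptions for
illustration; the real sector count should come from the disk itself.

```python
def backup_gpt_header_lba(disk_bytes: int, sector_size: int = 512) -> int:
    """LBA where the backup GPT header is expected: the disk's last sector."""
    total_sectors = disk_bytes // sector_size
    return total_sectors - 1


def alternate_lba_plausible(alternate_lba: int, disk_bytes: int,
                            sector_size: int = 512) -> bool:
    """True if the primary header's alternate-LBA field points at the
    last sector of a disk of the given size."""
    return alternate_lba == backup_gpt_header_lba(disk_bytes, sector_size)


# Nominal 300 GB disk with 512-byte sectors (hypothetical size).
disk_bytes = 300 * 10**9
print(backup_gpt_header_lba(disk_bytes))               # 585937499
print(alternate_lba_plausible(585937499, disk_bytes))  # True
```

If the value read from the primary header does not match the disk's actual
last sector, that would support the theory of a misplaced backup label.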


Regards
Steve

Re: recent change to vim defaults?

2017-01-15 Thread Steven Hartland

On 15/01/2017 17:45, Pete Wright wrote:



On 1/15/17 8:22 AM, Kyle Evans wrote:

On Jan 15, 2017 10:03, "Julian Elischer"  wrote:

I noticed that suddenly vim is grabbing mouse movements, which makes life
really hard.

Was there a specific revision that brought in this change, and can it be
removed?




Yeah, I can second this - IIRC it looks like since around Sept or Oct vim
defaults to enabling Visual Mode.  I've been setting this in my
~/.vimrc to disable it - but not enabling Visual Mode by default would
be awesome for me:


set mouse-=a

Yeah, we hated this change too.


Re: HEADS-UP: IFLIB implementations of sys/dev/e1000 em, lem, igb pending

2017-01-06 Thread Steven Hartland
Hmm, I'm not sure about everyone else, but we treat emX as legacy
devices (not used one in years), whereas igbX is very common here.


The impact of changing a NIC device name is quite a bit more involved
than just rc.conf; it affects other areas too (jails etc.), so given we can
lose access to the machine on reboot if everything isn't done right, it
would be worth considering:


 * Change emX -> igbX to lower the impact.
 * Add shims / alias so that operations on the device name going away
   still work.

What do people think?


On 06/01/2017 03:17, Sean Bruno wrote:

tl;dr --> igbX devices will become emX devices

We're about to commit an update to sys/dev/e1000 that will implement and
activate IFLIB for em(4), lem(4) & igb(4) and would appreciate all folks
who can test and poke at the drivers to do so this week.  This will have
some really great changes for performance and standardization that have
been bouncing around inside of various FreeBSD shops that have been
collaborating with Matt Macy over the last year.

This will implement multiple queues for certain em(4) devices that are
capable of such things and add some new sysctl's for you to poke at in
your monitoring tools.

Due to limitations of device registration, igbX devices will become emX
devices.  So, you'll need to make a minor update to your rc.conf and
scripts that manipulate the network devices.

UPDATING will be bumped to reflect these changes.

MFC to stable/11 will have a legacy implementation that doesn't use
IFLIB for compatibility reasons.

A documentation and man page update will follow in the next few days
explaining how to work with the changed driver.

sean

bcc net@ current@ re@







Re: Destroy GPT partition scheme absolutely, how?

2016-10-04 Thread Steven Hartland
CAM already does this, doesn't really help with speed as much as you 
might think though.
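
To make "coalescing" concrete, here is a minimal sketch of the general idea:
merging adjacent or overlapping delete ranges so that fewer commands need to
be issued. This is not the CAM code, just an illustration; the range
representation and function name are invented for the example.

```python
from typing import Iterable, List, Tuple

Range = Tuple[int, int]  # (start_lba, block_count)


def coalesce(ranges: Iterable[Range]) -> List[Range]:
    """Merge adjacent or overlapping (start, count) ranges."""
    merged: List[Range] = []
    for start, count in sorted(ranges):
        if count <= 0:
            continue  # ignore empty requests
        if merged and start <= merged[-1][0] + merged[-1][1]:
            # Touches or overlaps the previous range: extend it.
            prev_start, prev_count = merged[-1]
            end = max(prev_start + prev_count, start + count)
            merged[-1] = (prev_start, end - prev_start)
        else:
            merged.append((start, count))
    return merged


print(coalesce([(0, 8), (8, 8), (32, 4), (34, 10)]))
# [(0, 16), (32, 12)]
```

Four requests collapse into two here; how much that helps in practice depends
on how the device handles the resulting commands.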


On 04/10/2016 19:53, Maxim Sobolev wrote:

For the whole disk destruction, hopefully one day we'd have BIO_DELETE
coalesce code, so that you can batch a lot of operations into a handful of
SATA commands. I've heard rumours imp@ was doing something along those lines.
As well as SSD disks smart enough to process those requests in the background.
Anyway, just saying. )





Re: commits between r305191 - r305211 broke zfs list

2016-09-03 Thread Steven Hartland

Out of sync kernel / world?

Do you have a crash dump?

On 03/09/2016 21:50, Subbsd wrote:

Hi.

Can anybody test the output of the

zfs list

command on FreeBSD current after r305211? On my hosts this leads to a
zfs segfault.


Re: 11.0-BETA1 make installworldworld fails

2016-07-18 Thread Steven Hartland
Whenever I've seen installworld issues after a full buildworld, it's been
clock related, so you might want to check that the clock on the machine is
right.
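
A plausible reason the clock matters: make decides what is out of date by
comparing file timestamps, so if the clock changes between buildworld and
installworld, already-built files can look stale and make tries to rebuild
them during the install. One way to look for this is to scan the object tree
for files whose modification time lies in the future. The sketch below is a
generic helper, not part of the FreeBSD build system; its name and the
one-second slack are assumptions.

```python
import os
import time
from typing import List, Optional


def files_from_the_future(root: str, now: Optional[float] = None,
                          slack: float = 1.0) -> List[str]:
    """Return files under root whose mtime is later than now + slack."""
    now = time.time() if now is None else now
    hits: List[str] = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime > now + slack:
                hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    # /usr/obj is the object directory seen in the quoted build log.
    for path in files_from_the_future("/usr/obj"):
        print(path)
```

Any output from such a scan suggests the clock was ahead when those files
were written, or is behind now.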


On 17/07/2016 20:58, Kim Culhan wrote:

Seeing this on FreeBSD 11.0-BETA1 #0 r302963M

After make buildworld completes with no problem, then rebooted in
single-user mode

in /usr/src:

make installworld

--

Installing everything

--
cd /usr/src; make -f Makefile.inc1 install
===> lib (install)
===> lib/csu (install)
===> lib/csu/amd64 (install)
cc -target x86_64-unknown-freebsd11.0 --sysroot=/usr/obj/usr/src/tmp
-B/usr/obj/usr/src/tmp/usr/bin -O2 -pipe
-I/usr/src/lib/csu/amd64/../common
-I/usr/src/lib/csu/amd64/../../libc/include -fno-omit-frame-pointer
-std=gnu99  -Wsystem-headers -Werror -Wall -Wno-format-y2k -W
-Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes
-Wpointer-arith -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch -Wshadow
-Wunused-parameter -Wcast-align -Wchar-subscripts -Winline -Wnested-externs
-Wredundant-decls -Wold-style-definition -Wno-pointer-sign -Wthread-safety
-Wno-empty-body -Wno-string-plus-int -Wno-unused-const-variable
-Qunused-arguments  ERROR-tried-to-rebuild-during-make-install -S -o crt1.s
/usr/src/lib/csu/amd64/crt1.c
*** Error code 127
Stop.
make[6]: stopped in /usr/src/lib/csu/amd64
*** Error code 1

Hope this helps.

thanks
-kim


Re: FreeBSD-11.0-BETA1-amd64-disc1.iso is too big for my 700MB CD-r

2016-07-12 Thread Steven Hartland

On 12/07/2016 22:20, Ed Schouten wrote:

2016-07-11 23:01 GMT+02:00 Ronald Klop :

Just downloaded the amd64 BETA1 ISO (873MB) and tried to burn a CD on
Windows 10. It complained that the ISO is too big for my 700 MB CD-r.

I remember back in the days we also had a 'miniinst' CD, which was
identical to 'bootonly', but at least contained the install sets to
get a minimal system working. What ever happened to that?

Since we found mfsbsd, we've never looked back; it does just that plus a
one-command install.



Re: FreeBSD-11.0-BETA1-amd64-disc1.iso is too big for my 700MB CD-r

2016-07-12 Thread Steven Hartland

On 12/07/2016 21:50, Slawa Olhovchenkov wrote:

On Tue, Jul 12, 2016 at 01:39:34PM -0700, Conrad Meyer wrote:


Maybe Tier 2 can deal with just bootonly.iso.  Or your machines should
be dropped from Tier 2 if they don't support USB and we aren't okay
with dropping disc1 support for all of Tier 2.

There's lots of aging hardware we don't support in modern FreeBSD,
including alpha and ia64.  USB is 20 years young at this point.

Not every BIOS can boot from USB.
I have a Fujitsu notebook that does not support USB boot.

From a USB Pen drive I can understand but from a USB DVD Drive that 
would be some seriously antiquated hardware!


Regards
Steve


Re: FreeBSD-11.0-BETA1-amd64-disc1.iso is too big for my 700MB CD-r

2016-07-12 Thread Steven Hartland

On 12/07/2016 20:52, Mark Linimon wrote:

On Tue, Jul 12, 2016 at 04:09:10PM +0930, Shane Ambler wrote:

+1 on dropping CD images.

I have 24U of things that don't have DVD players, including some tier-2
machines for which no upgrade is available.


And no USB?


Re: FreeBSD-11.0-BETA1-amd64-disc1.iso is too big for my 700MB CD-r

2016-07-11 Thread Steven Hartland



On 11/07/2016 23:39, Allan Jude wrote:

On 2016-07-11 18:33, Chris H wrote:
On Tue, 12 Jul 2016 00:46:04 +0300 Slawa Olhovchenkov wrote:



On Mon, Jul 11, 2016 at 09:41:44PM +, Glen Barber wrote:


On Mon, Jul 11, 2016 at 03:32:34PM -0600, Alan Somers wrote:

On Mon, Jul 11, 2016 at 2:01 PM, Ronald Klop wrote:

Hi,

Just downloaded the amd64 BETA1 ISO (873MB) and tried to burn a CD on
Windows 10. It complained that the ISO is too big for my 700 MB CD-r.


The bootonly iso (281MB) burns and runs ok.

Regards,
Ronald.


Please open a PR.  Those images should be able to fit on a CD.


This was actually a known "going to be problem" thing for 11.0.  I'm
looking into how to fix this for 11.0-RELEASE, but right now, there is
not much more we can exclude from it. :(

Can't it use the compressed iso format, or is it already using that
format. Sorry haven't checked.


Reduce GENERIC to MINIMAL?


--Chris






380MB of the data on disc1 is the distsets, which are already .txz 
(max compression). That doesn't leave much room for the live OS on the 
disk.
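
To put numbers on that, here is a small back-of-the-envelope sketch using only the sizes quoted in this thread. It assumes all three figures are in the same unit and treats everything on the ISO that is not a distset as "the rest"; the function name is made up for illustration:

```python
CD_CAPACITY_MB = 700   # nominal CD-R size quoted in the thread
ISO_SIZE_MB = 873      # 11.0-BETA1 amd64 disc1, as reported
DISTSETS_MB = 380      # already-compressed .txz distribution sets


def disc_budget(capacity_mb, iso_mb, distsets_mb):
    """Return (room left beside the distsets, current non-distset size, overshoot)."""
    room_for_rest = capacity_mb - distsets_mb
    current_rest = iso_mb - distsets_mb
    overshoot = iso_mb - capacity_mb
    return room_for_rest, current_rest, overshoot


room, rest, over = disc_budget(CD_CAPACITY_MB, ISO_SIZE_MB, DISTSETS_MB)
print(f"room beside distsets: {room} MB, currently used: {rest} MB, over by: {over} MB")
```

So on these figures roughly 173 MB would have to come out of the non-distset part of the image to fit a 700 MB disc.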



Silly question but what about only supporting DVD?

I can't remember the last server I installed that had CDROM drive vs a 
DVD drive.



Re: Setting sysctl vfs.zfs.arc_max failed: 22

2016-07-08 Thread Steven Hartland
Thanks for testing Nathan, that is the expected behaviour. It might be
nice if we had the concept of a sysctl being at its auto value, so we
could use that to determine whether to recalculate automatic values which
hadn't been manually set, but unfortunately we don't have that.


On 08/07/2016 07:48, Nathan Bosley wrote:

I was just testing this a bit.
I can now set max in loader.conf as expected.

I did notice one thing that I thought was a bit strange though.

As a reference, here are my defaults without any ARC tunables/sysctls:

vfs.zfs.arc_meta_limit: 3903459328
vfs.zfs.arc_min: 1951729664
vfs.zfs.arc_max: 15613837312

If I put vfs.zfs.arc_max="8589934592" in loader.conf, the results are:

vfs.zfs.arc_meta_limit: 2147483648
vfs.zfs.arc_min: 1073741824
vfs.zfs.arc_max: 8589934592

So meta_limit and min are also changed, which is reasonable.

If I remove all of my ARC tunables in loader.conf, so that I have the 
default values after booting, and then use:

# sysctl vfs.zfs.arc_max="8589934592"

The result is:

vfs.zfs.arc_meta_limit: 2147483648
vfs.zfs.arc_min: 1951729664
vfs.zfs.arc_max: 8589934592

Max was set as requested.
meta_limit was set to max/4.
But min is still at the default.

In other words, if I use loader.conf to set max, then min and 
meta_limit are also recalculated.
But if I use sysctl to set max, only meta_limit is recalculated; min 
remains at the default.
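
The numbers above can be reproduced with a toy model. This is only a sketch fitted to the reported values, not the kernel's code: meta_limit = max/4 is stated above, while min = max/8 is an assumption inferred from the figures, and both function names are hypothetical:

```python
def derive_from_max(arc_max):
    """Boot-time derivation matching the reported numbers: min = max/8, meta_limit = max/4."""
    return {
        "arc_max": arc_max,
        "arc_min": arc_max // 8,
        "arc_meta_limit": arc_max // 4,
    }


def set_max_at_runtime(state, new_max):
    """Runtime sysctl behaviour as observed: meta_limit follows max, min is left alone."""
    updated = dict(state)
    updated["arc_max"] = new_max
    updated["arc_meta_limit"] = new_max // 4
    return updated


defaults = derive_from_max(15613837312)
via_loader = derive_from_max(8589934592)
via_sysctl = set_max_at_runtime(defaults, 8589934592)
```

Here via_loader ends up with the smaller arc_min, while via_sysctl keeps the arc_min from defaults, which is the difference described above.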

I'm not sure if that's the intent.

Just a heads-up.

Thanks again.

On Wed, Jul 6, 2016 at 7:51 PM, Steven Hartland
<kill...@multiplay.co.uk> wrote:


On 06/07/2016 21:39, Eric van Gyzen wrote:

    On 07/06/16 03:35 PM, Steven Hartland wrote:

The ARC settings and kmem aren't initialised when tunables are loaded
so the tests fail.

I've fixed this locally by blindly setting if ARC is not configured.
Request to commit the fix is with re@

In the mean time the patch is attached.

Thanks for the report and sorry about the breakage.

No worries.  Thanks for the quick fix.

https://svnweb.freebsd.org/changeset/base/302382



Re: Setting sysctl vfs.zfs.arc_max failed: 22

2016-07-06 Thread Steven Hartland

On 06/07/2016 21:39, Eric van Gyzen wrote:

On 07/06/16 03:35 PM, Steven Hartland wrote:

The ARC settings and kmem aren't initialised when tunables are loaded
so the tests fail.

I've fixed this locally by blindly setting if ARC is not configured.
Request to commit the fix is with re@

In the mean time the patch is attached.

Thanks for the report and sorry about the breakage.

No worries.  Thanks for the quick fix.


https://svnweb.freebsd.org/changeset/base/302382


Re: Setting sysctl vfs.zfs.arc_max failed: 22

2016-07-06 Thread Steven Hartland
The ARC settings and kmem aren't initialised when tunables are loaded, so
the tests fail.

I've fixed this locally by blindly setting the value if ARC is not yet
configured. Request to commit the fix is with re@

In the meantime the patch is attached.

Thanks for the report and sorry about the breakage.
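
For anyone following along, a toy model of the idea, not the attached patch itself. It assumes the limits are simply zero before the ARC and kmem are set up, so a range check at tunable-load time rejects any real value; the class and method names are invented for illustration:

```python
import errno


class ArcModel:
    """Toy stand-in for the ARC limits; every field here is illustrative."""

    def __init__(self):
        self.configured = False
        self.kmem_size = 0      # assumed to be zero until kmem is set up
        self.arc_c_min = 0
        self.arc_max = 0

    def configure(self, kmem_size, arc_c_min):
        self.kmem_size, self.arc_c_min = kmem_size, arc_c_min
        self.configured = True

    def _in_range(self, value):
        return self.arc_c_min <= value <= self.kmem_size

    def set_max_always_checked(self, value):
        """Always validate: fails with EINVAL when called before configure()."""
        if not self._in_range(value):
            return errno.EINVAL
        self.arc_max = value
        return 0

    def set_max_blind_if_unconfigured(self, value):
        """Accept the value unvalidated until the ARC has been configured."""
        if self.configured and not self._in_range(value):
            return errno.EINVAL
        self.arc_max = value
        return 0
```

With a fresh ArcModel, set_max_always_checked(6442450944) returns EINVAL while set_max_blind_if_unconfigured(6442450944) succeeds; once configure() has run, both validate as before.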

On 06/07/2016 07:20, Nathan Bosley wrote:

Maybe I misunderstood after all.
I took this:
"You can work around it temporarily by setting a lower arc_min first."

To mean that I could do something like:
vfs.zfs.arc_min="1073741824"
vfs.zfs.arc_max="8589934592"

in loader.conf, which would circumvent the problem.
But with the above, in that order, I still get:

Setting sysctl vfs.zfs.arc_max failed: 22
Setting sysctl vfs.zfs.arc_min failed: 22

As an FYI, WITHOUT any tunables set, my defaults are:
vfs.zfs.arc_meta_limit: 3903459328
vfs.zfs.arc_min: 1951729664
vfs.zfs.arc_max: 15613837312

So even if I only specified:
vfs.zfs.arc_max="8589934592"

in loader.conf, that's still not below my default min.

However, if I make the changes with 'sysctl' after boot, it works fine:

root@athlonbsd:~ # sysctl vfs.zfs.arc_min="1073741824"
vfs.zfs.arc_min: 1951729664 -> 1073741824

root@athlonbsd:~ # sysctl vfs.zfs.arc_max="8589934592"

vfs.zfs.arc_max: 15613837312 -> 8589934592

They also work fine in sysctl.conf.

Sorry if this is a silly question:
I should still be able to set these max/min values in loader.conf,
right--not just in sysctl.conf?

Thanks.


On Tue, Jul 5, 2016 at 10:16 PM, Nathan Bosley <nathan.bos...@gmail.com>
wrote:


OK, I follow you now.
Thanks for the explanation.
I will try that later tonight or tomorrow.

On Tue, Jul 5, 2016 at 9:45 PM, Allan Jude <allanj...@freebsd.org> wrote:


On 2016-07-05 21:32, Nathan Bosley wrote:

I think in about 4 - 5 hours I can show what values I'm using in
loader.conf under, say, r302264 and r302265 for comparison. I'm not 100%
sure that the problem arose for me in r302265; I merely suspect it.

On Tue, Jul 5, 2016 at 9:25 PM, Allan Jude <allanj...@freebsd.org> wrote:

 On 2016-07-05 20:27, Steven Hartland wrote:
 > Ahh right, let me check that.
 >
 > On 06/07/2016 00:51, Nathan Bosley wrote:
 >> I actually have this same problem.
 >> I'll send more details when I get home later.
 >>
 >> I think the problem started for me after r302265.
 >> Before that, I can set vfs.zfs.arc_max and vfs.zfs.arc_min in
 >> loader.conf.
 >> After r302265, setting either vfs.zfs.arc_max or vfs.zfs.arc_min

in

 >> loader.conf results in the EINVAL errors in 'dmesg':
 >>
 >> Setting sysctl vfs.zfs.arc_max failed: 22
 >> Setting sysctl vfs.zfs.arc_min failed: 22
 >>
 >> But setting vfs.zfs.arc_meta_limit in loader.conf works fine.
 >>
 >> But I did notice that using 'sysct' or sysctl.conf for

vfs.zfs.arc_max

 >> and vfs.zfs.arc_min works.
 >> I only have problems with setting them now in loader.conf.
 >>
 >> Like I said, I'll try to send output from my setup later.
 >>
 >> Thanks.
 >>
 >> On Tue, Jul 5, 2016 at 6:10 PM, Steven Hartland
 >> <ste...@multiplay.co.uk <mailto:ste...@multiplay.co.uk>
 <mailto:ste...@multiplay.co.uk <mailto:ste...@multiplay.co.uk>>>

wrote:

 >>
 >> What is it currently?
 >>
 >> Just had a quick play here:
 >> sysctl vfs.zfs.arc_max
 >> vfs.zfs.arc_max: 32283127808
 >> sysctl vfs.zfs.arc_max=32283127807
 >> vfs.zfs.arc_max: 32283127808 -> 32283127807
 >> sysctl vfs.zfs.arc_max=32283127808
 >> vfs.zfs.arc_max: 32283127807 -> 32283127808
 >>
 >> Error 22 = EINVAL so I suspect you're requesting a value
 which one
 >> of the following:
 >> * < arc_abs_min
 >> * > kmem_size
 >> * < arc_c_min
 >> * < zfs_arc_meta_limit
 >>
 >> Regards
 >> Steve
 >>
 >> On 05/07/2016 22:56, Eric van Gyzen wrote:
 >>
 >> Steven and -current:
 >>
 >> I just updated to r302350 with a GENERIC kernel config.
 I see
 >> this in
 >> dmesg:
 >>
 >>  VT(efifb): resolution 1024x768
 >>  Setting sysctl vfs.zfs.arc_max failed: 22
 >>  CPU: Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
 >> (3491.98-MHz K8-class
 >>  CPU)
 >>
 >> The relevant parts of /boot/l

Re: Setting sysctl vfs.zfs.arc_max failed: 22

2016-07-06 Thread Steven Hartland

On 06/07/2016 02:25, Allan Jude wrote:

On 2016-07-05 20:27, Steven Hartland wrote:

Ahh right, let me check that.

On 06/07/2016 00:51, Nathan Bosley wrote:

I actually have this same problem.
I'll send more details when I get home later.

I think the problem started for me after r302265.
Before that, I can set vfs.zfs.arc_max and vfs.zfs.arc_min in
loader.conf.
After r302265, setting either vfs.zfs.arc_max or vfs.zfs.arc_min in
loader.conf results in the EINVAL errors in 'dmesg':

Setting sysctl vfs.zfs.arc_max failed: 22
Setting sysctl vfs.zfs.arc_min failed: 22

But setting vfs.zfs.arc_meta_limit in loader.conf works fine.

But I did notice that using 'sysctl' or sysctl.conf for vfs.zfs.arc_max
and vfs.zfs.arc_min works.
I only have problems with setting them now in loader.conf.

Like I said, I'll try to send output from my setup later.

Thanks.

On Tue, Jul 5, 2016 at 6:10 PM, Steven Hartland
<ste...@multiplay.co.uk> wrote:

 What is it currently?

 Just had a quick play here:
 sysctl vfs.zfs.arc_max
 vfs.zfs.arc_max: 32283127808
 sysctl vfs.zfs.arc_max=32283127807
 vfs.zfs.arc_max: 32283127808 -> 32283127807
 sysctl vfs.zfs.arc_max=32283127808
 vfs.zfs.arc_max: 32283127807 -> 32283127808

 Error 22 = EINVAL so I suspect you're requesting a value which is one
 of the following:
 * < arc_abs_min
 * > kmem_size
 * < arc_c_min
 * < zfs_arc_meta_limit

 Regards
 Steve

 On 05/07/2016 22:56, Eric van Gyzen wrote:

 Steven and -current:

 I just updated to r302350 with a GENERIC kernel config.  I see
 this in
 dmesg:

  VT(efifb): resolution 1024x768
  Setting sysctl vfs.zfs.arc_max failed: 22
  CPU: Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
 (3491.98-MHz K8-class
  CPU)

 The relevant parts of /boot/loader.conf are:

  zfs_load="YES"
  vfs.zfs.arc_max="6442450944"

 Let me know what other information you need.

 Cheers,

 Eric



I think the issue might be that the default value of arc_min is higher
than what the user is trying to set arc_max to. In that case we might
want sysctl to lower arc_min instead of giving an error?

It would definitely be a POLA violation to have to set arc_min lower to
be able to have existing lines that set arc_max in loader.conf work
correctly.

I'm actually thinking it's because the initial calculation hasn't
occurred yet.


This is not apparent on 10 because the tunable and the sysctl are 
separate. I'm waiting for my head box to rebuild ATM and will check when 
that's done.


Regards
Steve


Re: Setting sysctl vfs.zfs.arc_max failed: 22

2016-07-05 Thread Steven Hartland

Ahh right, let me check that.

On 06/07/2016 00:51, Nathan Bosley wrote:

I actually have this same problem.
I'll send more details when I get home later.

I think the problem started for me after r302265.
Before that, I can set vfs.zfs.arc_max and vfs.zfs.arc_min in loader.conf.
After r302265, setting either vfs.zfs.arc_max or vfs.zfs.arc_min in 
loader.conf results in the EINVAL errors in 'dmesg':


Setting sysctl vfs.zfs.arc_max failed: 22
Setting sysctl vfs.zfs.arc_min failed: 22

But setting vfs.zfs.arc_meta_limit in loader.conf works fine.

But I did notice that using 'sysctl' or sysctl.conf for vfs.zfs.arc_max
and vfs.zfs.arc_min works.

I only have problems with setting them now in loader.conf.

Like I said, I'll try to send output from my setup later.

Thanks.

On Tue, Jul 5, 2016 at 6:10 PM, Steven Hartland
<ste...@multiplay.co.uk> wrote:


What is it currently?

Just had a quick play here:
sysctl vfs.zfs.arc_max
vfs.zfs.arc_max: 32283127808
sysctl vfs.zfs.arc_max=32283127807
vfs.zfs.arc_max: 32283127808 -> 32283127807
sysctl vfs.zfs.arc_max=32283127808
vfs.zfs.arc_max: 32283127807 -> 32283127808

Error 22 = EINVAL so I suspect you're requesting a value which is one
of the following:
* < arc_abs_min
* > kmem_size
* < arc_c_min
* < zfs_arc_meta_limit

Regards
Steve

On 05/07/2016 22:56, Eric van Gyzen wrote:

Steven and -current:

I just updated to r302350 with a GENERIC kernel config.  I see
this in
dmesg:

 VT(efifb): resolution 1024x768
 Setting sysctl vfs.zfs.arc_max failed: 22
 CPU: Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
(3491.98-MHz K8-class
 CPU)

The relevant parts of /boot/loader.conf are:

 zfs_load="YES"
 vfs.zfs.arc_max="6442450944"

Let me know what other information you need.

Cheers,

Eric




Re: Setting sysctl vfs.zfs.arc_max failed: 22

2016-07-05 Thread Steven Hartland

What is it currently?

Just had a quick play here:
sysctl vfs.zfs.arc_max
vfs.zfs.arc_max: 32283127808
sysctl vfs.zfs.arc_max=32283127807
vfs.zfs.arc_max: 32283127808 -> 32283127807
sysctl vfs.zfs.arc_max=32283127808
vfs.zfs.arc_max: 32283127807 -> 32283127808

Error 22 = EINVAL so I suspect you're requesting a value which is one of
the following:

* < arc_abs_min
* > kmem_size
* < arc_c_min
* < zfs_arc_meta_limit
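
Spelled out as a sketch, the four conditions amount to something like the following. The parameter names mirror the list above, but the function itself and the numbers used with it are illustrative, not taken from the kernel source:

```python
import errno


def check_arc_max(value, arc_abs_min, kmem_size, arc_c_min, zfs_arc_meta_limit):
    """Return 0 if value passes all four checks, otherwise errno.EINVAL."""
    if (value < arc_abs_min
            or value > kmem_size
            or value < arc_c_min
            or value < zfs_arc_meta_limit):
        return errno.EINVAL
    return 0


# Made-up limits purely to exercise the checks.
limits = dict(arc_abs_min=16 << 20, kmem_size=32 << 30,
              arc_c_min=2 << 30, zfs_arc_meta_limit=4 << 30)
print(check_arc_max(6 << 30, **limits))   # inside every bound
print(check_arc_max(1 << 30, **limits))   # below arc_c_min and the meta limit
```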

Regards
Steve
On 05/07/2016 22:56, Eric van Gyzen wrote:

Steven and -current:

I just updated to r302350 with a GENERIC kernel config.  I see this in
dmesg:

 VT(efifb): resolution 1024x768
 Setting sysctl vfs.zfs.arc_max failed: 22
 CPU: Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz (3491.98-MHz K8-class
 CPU)

The relevant parts of /boot/loader.conf are:

 zfs_load="YES"
 vfs.zfs.arc_max="6442450944"

Let me know what other information you need.

Cheers,

Eric





Re: ATA? related trouble with r300299

2016-05-24 Thread Steven Hartland



On 24/05/2016 21:56, Kenneth D. Merry wrote:

On Tue, May 24, 2016 at 23:54:09 +0300, Oleg V. Nauman wrote:

On Tuesday 24 May 2016 16:17:33 you wrote:

Okay, I've got a basic idea of what may be going on.  The resets that are
getting sent are triggering another probe, which then triggers a reset,
which triggers a probe...and so on.

So here is another patch that should work for you:

https://people.freebsd.org/~ken/cam_smr_ada_patch.20160524.2.txt

I have commented out the quirk for this drive, and the driver will now only
start the SMR probe on drives that claim to be SMR-capable.  So, for the
vast majority of drives out there right now, it won't even start the extra
probe steps.

  It fixes this issue. I was able to boot with your latest patch.

Great!  I'll check it in with that fix as well as a quirk entry.  That way,
if we have other reasons later on to issue a read log, we'll know that
it doesn't work for those drives.

Might be worth seeing if smartmontools can read the log from that drive
before committing the quirk as a double check.


Regards
Steve



Re: Samsung SSD TRIM [Was: Heads up]

2016-04-15 Thread Steven Hartland

On 15/04/2016 09:11, Kurt Jaeger wrote:

Hi!

avg wrote:

For what it's worth, I have been using the following SSDs since September of 
2015 :

ada2:  ACS-2 ATA SATA 3.x device
ada3:  ACS-2 ATA SATA 3.x device

I have one in use with zfs and trim:

ada1:  ACS-2 ATA SATA 3.x device

Works as my ports build hosts, and is fine as far as I can see.

Prior to Warner's commit there was no NCQ TRIM support in FreeBSD, so
while it was working with standard non-NCQ TRIM (and I can corroborate
that as we use the 840's and 850's all over with ZFS with TRIM enabled)
it's possible that it could cause issues when NCQ TRIM comes into play.


From what I read when this issue first came to light, I believe the 
actual issue was a Linux kernel bug not a FW bug in Samsung drives that 
caused the corruption. This is the thread which details said issue: 
http://www.spinics.net/lists/raid/msg49440.html


Regards
Steve


Re: Question about cam 4K quirks

2016-04-11 Thread Steven Hartland



On 11/04/2016 15:24, Tomoaki AOKI wrote:

Thanks for your answer!

On Sun, 10 Apr 2016 16:15:56 +0100
Steven Hartland <kill...@multiplay.co.uk> wrote:



On 10/04/2016 15:35, Tomoaki AOKI wrote:

On Sun, 10 Apr 2016 06:59:04 -0600
Alan Somers <asom...@freebsd.org> wrote:


On Sun, Apr 10, 2016 at 12:56 AM, Tomoaki AOKI <junch...@dec.sakura.ne.jp>
wrote:


Hi. Maybe freebsd-hardware list would be the right place, but it's not
so active. :-(

Is 4K quirks needed for every HDDs/SSDs having physical sector size
4096?

If so, I would be able to provide patch for Crucial M550 and MX200.
(Possibly covers other models [BX200 etc.] by abstraction.)

M550(1TB):  device model  Crucial CT1024M550SSD1
firmware revision MU01
MX200(1TB): device model  Crucial CT1024MX200SSD1
firmware revision MU03
  -> Abstracted with "Crucial CT*SSD*" or "Crucial CT*", as the part
 "1024" should vary with its capacity and can be 3 to 4 digits
 for now. I tried the former and confirmed "quirks=0x1<4K>"
 appears, which doesn't appear without adding the entry.


If not, is it sufficient if `camcontrol identify ` states
"physical 4096" on "sector size" line for everything in kernel and
related components (i.e., zfs-related ones)?


Regards.

You only need quirk entries if the device fails to identify its physical
size correctly.  If "camcontrol identify" states "physical 4096", then
you're probably ok, but it's not the best place to ask.  "camcontrol
identify" asks the device directly, whereas "diskinfo -v" asks the kernel.
If "diskinfo -v" says "4096 stripesize" then you're definitely ok.

-Alan

Thanks for clarification.

Tried "diskinfo -v" as you noted (of course running the kernel without
adding quirks entry) and confirmed it saying "4096 # stripesize".
So it's already OK with current ata_da.c and scsi_da.c (no quirks is
needed).

OTOH, trying with Samsung 850 evo (the last one I have for now,
having quirks entry in current source), "diskinfo -v" says "4096
# stripesize" while "camcontrol identify" says "physical 512".
This should be why quirks entries are needed (and implemented) for it.

Correct, manufacturers took the cop-out route and return 512 for both
logical and physical sizes to avoid issues with bad OS support.
SSDs are particularly lazy in this regard.
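
As a sketch of the check being discussed: pull sectorsize and stripesize out of "diskinfo -v" output and see whether the kernel already treats the disk as 4K. The sample text below is illustrative rather than captured from a real drive, and sector_sizes is a made-up helper name:

```python
WANTED = ("sectorsize", "stripesize")


def sector_sizes(diskinfo_text):
    """Extract 'value # name' pairs for sectorsize and stripesize."""
    sizes = {}
    for line in diskinfo_text.splitlines():
        value, sep, label = line.partition("#")
        words = label.split()
        if sep and words and words[0] in WANTED:
            sizes[words[0]] = int(value.strip())
    return sizes


# Illustrative sample in the layout diskinfo -v prints.
SAMPLE = """\
/dev/ada0
    512             # sectorsize
    4096            # stripesize
"""

sizes = sector_sizes(SAMPLE)
kernel_sees_4k = sizes.get("stripesize") == 4096
```

If kernel_sees_4k is true without a quirk entry, the device reported its physical size itself; if it is only true with the quirk, the quirk is doing the work.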

I think stripesize should be primarily for RAID configuration, but
after 4k physical sectored drives  (so called AFT drives) appears,
applied to even for single drive configuration, too. Right?

stripesize simply gives a hint as to performance when accessing the device.

So now FreeBSD's ZFS defaults ashift 12, if I remember correctly, to
align datasets with 4k.
ZFS calculates the most suitable value given the reported physical and
logical sector sizes; technically the default is ashift 9 unless altered
by setting vfs.zfs.min_auto_ashift.

And UFS has minimum blocksize of 4k (defaults
8k). And more, now gpart can align partitions as root specifies.


If so, as writing blocks smaller than stripesize (except for the last
block of a file) is nonsense for RAID configuration, all write access
to HDDs/SSDs are constrained to use stripesize for minimum block size,
right?

Nope, sectorsize constrains that.

So possibly some filesystems can be mis-aligned even if the start point
is properly aligned.

  *Mis-aligned fragments should be allowed, though.


stripesize is only used as a way to help tune filesystem access patterns
e.g. in ZFS it is used to help determine the ashift value which in turn
determines the minimum allocatable block size. This helps optimise
performance while sacrificing storage space i.e. causing wastage.

Exactly. :-)
But there's large possibility of severe performance degradation caused
by mis-aligned blocks, especially in HDD, and the capacities of HDDs
became large, even in 2.5inch form factor. Defaulting block size to
physical sector size would be reasonable, if any option to downsize to
512 bytes is provided.

That would be a step back, correcting the offset would be the better fix.

  *If I remember correctly, block size of UFS is 4096 bytes at minimum,
   but supports 512 bytes fragments for small files (and to concentrate
   tail portions of large files). It would be in many cases reasonable
   trade-off, too.

I don't use UFS so couldn't comment.


Re: Question about cam 4K quirks

2016-04-10 Thread Steven Hartland



On 10/04/2016 15:35, Tomoaki AOKI wrote:

On Sun, 10 Apr 2016 06:59:04 -0600
Alan Somers  wrote:


On Sun, Apr 10, 2016 at 12:56 AM, Tomoaki AOKI 
wrote:


Hi. Maybe freebsd-hardware list would be the right place, but it's not
so active. :-(

Is 4K quirks needed for every HDDs/SSDs having physical sector size
4096?

If so, I would be able to provide patch for Crucial M550 and MX200.
(Possibly covers other models [BX200 etc.] by abstraction.)

   M550(1TB):  device model  Crucial CT1024M550SSD1
   firmware revision MU01
   MX200(1TB): device model  Crucial CT1024MX200SSD1
   firmware revision MU03
 -> Abstracted with "Crucial CT*SSD*" or "Crucial CT*", as the part
"1024" should vary with its capacity and can be 3 to 4 digits
for now. I tried the former and confirmed "quirks=0x1<4K>"
appears, which doesn't appear without adding the entry.


If not, is it sufficient if `camcontrol identify ` states
"physical 4096" on "sector size" line for everything in kernel and
related components (i.e., zfs-related ones)?


Regards.


You only need quirk entries if the device fails to identify its physical
size correctly.  If "camcontrol identify" states "physical 4096", then
you're probably ok, but it's not the best place to ask.  "camcontrol
identify" asks the device directly, whereas "diskinfo -v" asks the kernel.
If "diskinfo -v" says "4096 stripesize" then you're definitely ok.

-Alan

Thanks for clarification.

Tried "diskinfo -v" as you noted (of course running the kernel without
adding quirks entry) and confirmed it saying "4096 # stripesize".
So it's already OK with current ata_da.c and scsi_da.c (no quirks is
needed).

OTOH, trying with Samsung 850 evo (the last one I have for now,
having quirks entry in current source), "diskinfo -v" says "4096
# stripesize" while "camcontrol identify" says "physical 512".
This should be why quirks entries are needed (and implemented) for it.
Correct, manufacturers took the cop-out route and return 512 for both
logical and physical sizes to avoid issues with bad OS support.

SSDs are particularly lazy in this regard.

I think stripesize should be primarily for RAID configuration, but
after 4k physical sectored drives  (so called AFT drives) appears,
applied to even for single drive configuration, too. Right?

stripesize simply gives a hint as to performance when accessing the device.

If so, as writing blocks smaller than stripesize (except for the last
block of a file) is nonsense for RAID configuration, all write access
to HDDs/SSDs are constrained to use stripesize for minimum block size,
right?

Nope, sectorsize constrains that.

stripesize is only used as a way to help tune filesystem access patterns 
e.g. in ZFS it is used to help determine the ashift value which in turn 
determines the minimum allocatable block size. This helps optimise 
performance while sacrificing storage space i.e. causing wastage.
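
To make the trade-off concrete, a simplified model: the minimum allocatable block is 2**ashift bytes, and anything smaller is rounded up to it. The selection rule below (larger of the reported sizes, floored at a minimum) is a deliberate simplification and an assumption, not the exact ZFS logic, and both function names are hypothetical:

```python
def auto_ashift(logical, physical, min_auto_ashift=9):
    """Smallest ashift covering the larger reported size, floored at min_auto_ashift."""
    size = max(logical, physical)
    ashift = (size - 1).bit_length()   # ceil(log2(size)) for size >= 1
    return max(ashift, min_auto_ashift)


def allocated_bytes(payload, ashift):
    """Round payload up to a whole number of 2**ashift byte blocks."""
    block = 1 << ashift
    return -(-payload // block) * block


for logical, physical in ((512, 512), (512, 4096)):
    shift = auto_ashift(logical, physical)
    print(logical, physical, "ashift", shift, "1 KiB occupies", allocated_bytes(1024, shift))
```

Under this model a 1 KiB write occupies 1 KiB at ashift 9 but 4 KiB at ashift 12, which is the wastage referred to above.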


Regards
Steve


Re: Mixed ashift?

2016-03-31 Thread Steven Hartland

vfs.zfs.min_auto_ashift is only used when a device is added, so you can set it,
add the device, then change it.
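
A sketch of that sequence, written as a dry run that only builds the command strings and executes nothing. The pool name and device names are placeholders, and the helper itself is hypothetical:

```python
def mixed_ashift_plan(pool, data_vdevs, log_vdev, data_ashift=9, log_ashift=13):
    """Return the shell commands, in order, without running them."""
    tunable = "vfs.zfs.min_auto_ashift"
    return [
        f"sysctl {tunable}={data_ashift}",
        f"zpool create {pool} {' '.join(data_vdevs)}",
        f"sysctl {tunable}={log_ashift}",
        f"zpool add {pool} log {log_vdev}",
    ]


# Placeholder names; substitute the real pool layout and devices.
for command in mixed_ashift_plan("tank", ["da0", "da1", "da2", "da3"], "ada4"):
    print(command)
```

Afterwards "zdb -C | grep ashift" should show which value each vdev actually got.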


On 31/03/2016 07:15, Allan Jude wrote:

On 2016-03-31 02:13, Dustin Marquess wrote:

I have what I think is a pretty normal setup.. a bunch of HDDs plus 2 SSDs
(one ZIL, one SLOG).

The HDDs are standard 512 byte sector drives.  The SSDs have 8k page sizes.

In Illumos I added the SSDs to sd.conf and created the zpool and it shows
the HDDs as ashift 9 and the SSDs as ashift 13, like normal:

# zdb -C | grep ashift
 ashift: 9
 ashift: 9
 ashift: 9
 ashift: 9
 ashift: 13

The question is, how to replicate this in FreeBSD?  The old "gnop" method
doesn't work anymore, and setting "vfs.zfs.min_auto_ashift=13" causes it to
use 13 for the HDDs, which seems like a waste.  Is this not supported?

Thanks!
-Dustin


gnop should work, and you'd set the ashift before you add the devices.
So add the hard drives with it set to 9, then set it to 13 and add the SLOG





Re: boot loaders got fatter in the last few days

2016-03-19 Thread Steven Hartland


On 18/03/2016 17:51, Guido Falsi wrote:

On 03/18/16 17:54, José Pérez wrote:

Hi Guido,
maybe it's because of this:
https://svnweb.freebsd.org/base?view=revision&revision=296963


I see.

There is a problem with this though, we have howtos suggesting 64K for
the size of the freebsd-boot gpt partition:

https://wiki.freebsd.org/RootOnZFS/GPTZFSBoot/RAIDZ1

now that size isn't sufficient anymore. We should at least update this
information soon.

Also repartitioning could be problematic in certain scenarios. I think
this change should be at least published in UPDATING and maybe also in
the future release notes for 11.0.

Personally I'll find a way of reorganizing my disks to fit this change,
but it's something that could bite users.

Yes indeed. I would suggest that if we increase it, we do so by a decent
amount, e.g. jump straight to 1MB.


Regards
Steve

Re: boot loaders got fatter in the last few days

2016-03-19 Thread Steven Hartland



On 18/03/2016 17:42, Nikolai Lifanov wrote:

On 03/18/16 13:03, Allan Jude wrote:

On 2016-03-18 12:33, Guido Falsi wrote:

Hi,

I have just updated one of my machines and noticed the bootloader files
got quite fat in the last few days, some by a big margin.

on an updated machine(r296993):


ls -l /boot/*boot*

-r--r--r--  1 root  wheel    8192 Mar 18 16:47 /boot/boot
-r--r--r--  1 root  wheel     512 Mar 18 16:47 /boot/boot0
-r--r--r--  1 root  wheel     512 Mar 18 16:47 /boot/boot0sio
-r--r--r--  1 root  wheel     512 Mar 18 16:47 /boot/boot1
-r-xr-xr-x  1 root  wheel   72152 Mar 18 16:47 /boot/boot1.efi
-r--r--r--  1 root  wheel  819200 Mar 18 16:47 /boot/boot1.efifat
-r--r--r--  1 root  wheel    7680 Mar 18 16:47 /boot/boot2
-r--r--r--  1 root  wheel    1185 Mar 18 16:47 /boot/cdboot
-r--r--r--  1 root  wheel   85794 Mar 18 16:47 /boot/gptboot
-r--r--r--  1 root  wheel  110546 Mar 18 16:47 /boot/gptzfsboot
-r--r--r--  1 root  wheel  358400 Mar 18 16:47 /boot/pxeboot
-r--r--r--  1 root  wheel  341248 Mar 18 16:47 /boot/userboot.so
-r--r--r--  1 root  wheel   66048 Mar 18 16:47 /boot/zfsboot

From a machine I still have not updated (r296719):


ls -l /boot/*boot*

-r--r--r--  1 root  wheel    8192 Mar 13 21:01 /boot/boot
-r--r--r--  1 root  wheel     512 Mar 13 21:01 /boot/boot0
-r--r--r--  1 root  wheel     512 Mar 13 21:01 /boot/boot0sio
-r--r--r--  1 root  wheel     512 Mar 13 21:01 /boot/boot1
-r-xr-xr-x  1 root  wheel   72152 Mar 13 21:01 /boot/boot1.efi
-r--r--r--  1 root  wheel  819200 Mar 13 21:01 /boot/boot1.efifat
-r--r--r--  1 root  wheel    7680 Mar 13 21:01 /boot/boot2
-r--r--r--  1 root  wheel    1185 Mar 13 21:01 /boot/cdboot
-r--r--r--  1 root  wheel   16059 Mar 13 21:01 /boot/gptboot
-r--r--r--  1 root  wheel   41511 Mar 13 21:01 /boot/gptzfsboot
-r--r--r--  1 root  wheel  288768 Mar 13 21:01 /boot/pxeboot
-r--r--r--  1 root  wheel  341208 Mar 13 21:01 /boot/userboot.so
-r--r--r--  1 root  wheel   66048 Mar 13 21:01 /boot/zfsboot

I noticed because my GPT boot partition is 64K and gptzfsboot just
passed 100K.
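
To put numbers on the growth, here is a small sketch that diffs the sizes from the two listings above. Only the four files whose size changed are included, and the function name growth() is invented for this example:

```python
# Sizes in bytes, copied from the r296719 (old) and r296993 (new) listings.
OLD = {"gptboot": 16059, "gptzfsboot": 41511,
       "pxeboot": 288768, "userboot.so": 341208}
NEW = {"gptboot": 85794, "gptzfsboot": 110546,
       "pxeboot": 358400, "userboot.so": 341248}


def growth(old: dict, new: dict) -> dict:
    """Return {file name: size increase in bytes} for files present in both maps."""
    return {name: new[name] - old[name] for name in old if name in new}


for name, delta in growth(OLD, NEW).items():
    print(f"{name:12s} {OLD[name]:7d} -> {NEW[name]:7d}  (+{delta} bytes)")
```

The three boot blocks each grew by roughly 68K, which on its own is already more than a 64K boot partition can hold.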

Is this expected and I'm supposed to repartition or is this an unwanted
mistake?

Thanks in advance.


This is a side effect of the loader gaining the ability to boot from
GELI encrypted partitions.

You can compile with LOADER_NO_GELI_SUPPORT to disable this to get back
to a smaller one if you need.

Maybe we should be putting the GELI enabled boot blocks in a different
filename? I generally wanted to avoid creating a new version of each
bootcode with GELI support.

My goal somewhere down the road is to create a single bootcode that can
do UFS and ZFS, then maybe we can have gptboot and gptgeliboot or
something.



Maybe a single gptbootlite for minimum viable case of UFS+nothing fancy?
At some point in the near future users that want additional features
will re-partition and bsdinstall will create larger partitions for boot
and this won't be a problem.

P.S.: Allan, do you plan to enable GELI support for boot1.efi?

That makes it harder to use more features, so I would vote against doing 
this. Keep a single boot image; it's bad enough that we already have 
separate ZFS and UFS loaders.



Re: Crashes in libthr?

2016-03-14 Thread Steven Hartland



On 14/03/2016 22:28, Larry Rosenman wrote:

On 2016-03-14 17:25, Poul-Henning Kamp wrote:


In message <2016031428.ga1...@borg.lerctr.org>, Larry Rosenman 
writes:



And sshd is busted.


FYI: I am seeing no such issues on two systems running:

11.0-CURRENT #4 r296808: Sun Mar 13 22:39:59 UTC 2016
and
11.0-CURRENT #32 r296137: Sat Feb 27 11:34:01 UTC 2016
As I said, it's this ONE box, even doing an install from the other 
(RUNNING) boxes (/usr/src, /usr/obj).

This build was at:
borg.lerctr.org /usr/src $ svn info
Path: .
Working Copy Root Path: /usr/src
URL: https://svn.freebsd.org/base/head
Relative URL: ^/head
Repository Root: https://svn.freebsd.org/base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 296823
Node Kind: directory
Schedule: normal
Last Changed Author: adrian
Last Changed Rev: 296823
Last Changed Date: 2016-03-13 23:39:35 -0500 (Sun, 13 Mar 2016)

borg.lerctr.org /usr/src $


I can post the make.conf.

It's really weird.

Silly question: you're not building on an NFS filesystem, are you?



Re: loader and load: path?

2016-03-07 Thread Steven Hartland
So you're saying "kldload zfs" fails because it can't find opensolaris, or 
are you giving it absolute paths?


If so, try without absolute paths.

Regards
Steve

On 07/03/2016 14:09, Larry Rosenman wrote:
If I type load /boot/kernel/kernel, and then load /boot/kernel/zfs.ko,
the loader (loader.efi in this case) says it can't find opensolaris. Same for
dtraceall, where it can't find its dependent modules. Just hitting enter does
the right thing.


What variable(s) am I missing?






Re: sr-iov issues, reset_hw() failed with error -100

2016-02-22 Thread Steven Hartland

Isn't that 48 logical cores in total (12 real + 12 virtual per CPU)?

On 22/02/2016 21:26, Ultima wrote:

This system has 24 cores (e5-2670v3)x2

Ultima

On Mon, Feb 22, 2016 at 3:53 PM, Pieper, Jeffrey E <
jeffrey.e.pie...@intel.com> wrote:


Just out of curiosity, how many cores does your system have?

Jeff

-Original Message-
From: owner-freebsd-curr...@freebsd.org [mailto:
owner-freebsd-curr...@freebsd.org] On Behalf Of Ultima
Sent: Monday, February 22, 2016 12:02 PM
To: Eric Joyner 
Cc: freebsd-current@freebsd.org; freebsd-virtualizat...@freebsd.org
Subject: Re: sr-iov issues, reset_hw() failed with error -100

After reboot...

ifconfig ix1 up

dhclient ix1
DHCPDISCOVER on ix1 to 255.255.255.255 port 67 interval 4
DHCPOFFER from 192.168.1.1
DHCPREQUEST on ix1 to 255.255.255.255 port 67
DHCPACK from 192.168.1.1
bound to 192.168.1.145 -- renewal in 21600 seconds.

ix0 down
ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
64 bytes from 192.168.1.1: icmp_seq=0 ttl=64 time=0.149 ms
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.171 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.167 ms

iovctl -Cf /etc/iovctl.conf

ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
^C
--- 192.168.1.1 ping statistics ---
29 packets transmitted, 0 packets received, 100.0% packet loss
ifconfig ix1 up
ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
^C
--- 192.168.1.1 ping statistics ---
12 packets transmitted, 0 packets received, 100.0% packet loss

ix1 is no longer usable until a restart...

iovctl -Dd ix1
ifconfig ix1 up
ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
^C
--- 192.168.1.1 ping statistics ---
9 packets transmitted, 0 packets received, 100.0% packet loss



Is there anything else that may be useful?

here is my ifconfig at the end (after ifconfig ix0 up)


ix0: flags=8943 metric 0
mtu 1500

options=e400b9
ether -Hidden-
inet 192.168.1.8 netmask 0xff00 broadcast 192.168.1.255
inet 192.168.1.9 netmask 0xff00 broadcast 192.168.1.255
nd6 options=29
media: Ethernet autoselect (10Gbase-T )
status: active
ix1: flags=8843 metric 0 mtu 1500

options=e407bb
ether -Hidden-
inet 192.168.1.145 netmask 0xff00 broadcast 192.168.1.255
nd6 options=29
media: Ethernet autoselect (10Gbase-T )
status: active
lo0: flags=8049 metric 0 mtu 16384
options=63
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
inet 127.0.0.1 netmask 0xff00
nd6 options=21
groups: lo
bridge0: flags=8843 metric 0 mtu
1500
ether -Hidden-
nd6 options=9
groups: bridge
id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
member: ix0 flags=143
ifmaxaddr 0 port 1 priority 128 path cost 2000
member: epair0a flags=143
ifmaxaddr 0 port 5 priority 128 path cost 2000
epair0a: flags=8943 metric
0 mtu 1500
options=8
ether -Hidden-
inet6 fe80::ff:70ff:fe00:50a%epair0a prefixlen 64 scopeid 0x5
nd6 options=21
media: Ethernet 10Gbase-T (10Gbase-T )
status: active
groups: epair

On Mon, Feb 22, 2016 at 1:51 PM, Eric Joyner  wrote:


Did you do an ifconfig up on ix1 before loading the VF driver?

On Sat, Feb 20, 2016 at 11:57 AM Ultima  wrote:


  Decided to do some testing with iovctl to see how sr-iov is coming along.
Currently when adding the vf's there are a couple of errors, and the network
no longer functions after iovctl is started. My guess is that the reset_hw()
call is failing. Any ideas why this call would fail? I tested this on both
ports; ix1 is detached and unused for this test, however inserting a cable
results in an unusable port. iovctl -Dd ix1 removes the vf's, however
functionality is still not restored without a system restart.

FreeBSD S1 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r295736: Wed Feb 17
21:17:28 EST 2016 root@S1:/usr/obj/usr/src/sys/MYKERNEL  amd64

/boot/loader.conf
hw.ix.num_queues="4"

/etc/iovctl.conf
PF {
 device : ix1;
 num_vfs : 31;
}

DEFAULT {
 passthrough : true;
}
VF-0 {
 passthrough : false;
}
VF-1 {
 passthrough : false;
}

# iovctl -C -f /etc/iovctl.conf

dmesg
ixv0: 

Re: ZFSROOT UEFI boot

2016-02-05 Thread Steven Hartland
Yep thanks, I committed this to head today, after Warner confirmed it
was good in his env too.

It will sit there for 1 week before I request permission to MFC to
stable/10.

On 05/02/2016 17:24, Tomoaki AOKI wrote:
> I got a feedback from Yuichiro NAITO at freebsd-users-jp ML.
>
> Boots fine using memstick created using release/release.sh on patched
> head. (Diff9) Properly shows up installer screen.
>
> The system is Macbook Air 2012 having root-on-UFS 10.2-RELEASE without
> ZFS. I think feedbacks using different firmware/platform would be
> valuable.
>



Re: ZFSROOT UEFI boot

2016-02-01 Thread Steven Hartland

Hmm, this doesn't look right:-
1) Boot from ada0 WITHOUT USB memstick and ada1 attached (single drive).

boot1 imagepath: pciroot(0x0):pci(0x1f,0x02):sata(0x1,0x0,0x0):hd(1)
probing: pciroot(0x0):pci(0x1f,0x02):sata(0x0,0x0,0x0)
probe: . not supported
probing: pciroot(0x0):pci(0x1f,0x02):sata(0x1,0x0,0x0)
probe: . not supported
probing: pciroot(0x0):pci(0x1f,0x02):sata(0x0,0x0,0x0):hd(1)
probe: . not supported
probing: pciroot(0x0):pci(0x1f,0x02):sata(0x0,0x0,0x0):hd(2)
probe: . not supported
probing: pciroot(0x0):pci(0x1f,0x02):sata(0x0,0x0,0x0):hd(3)
probe: . not supported
probing: pciroot(0x0):pci(0x1f,0x02):sata(0x0,0x0,0x0):hd(4)
probe: + supported
probing: pciroot(0x0):pci(0x1f,0x02):sata(0x0,0x0,0x0):hd(5)
probe: + supported

This is saying you booted from ada0 
"pciroot(0x0):pci(0x1f,0x02):sata(0x1,0x0,0x0):hd(1)", but the device 
list doesn't contain that path; it does contain 
"pciroot(0x0):pci(0x1f,0x02):sata(0x0,0x0,0x0):hd(1)", i.e. sata(0x1,... 
vs sata(0x0,... Could you double check this result please?
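
The mismatch can be checked mechanically. The following is only a sketch over the debug output quoted above; the parse() helper is invented for illustration and is not part of boot1:

```python
# Debug output from test 1 as quoted above (the "probe:" result lines are omitted).
LOG = """\
boot1 imagepath: pciroot(0x0):pci(0x1f,0x02):sata(0x1,0x0,0x0):hd(1)
probing: pciroot(0x0):pci(0x1f,0x02):sata(0x0,0x0,0x0)
probing: pciroot(0x0):pci(0x1f,0x02):sata(0x1,0x0,0x0)
probing: pciroot(0x0):pci(0x1f,0x02):sata(0x0,0x0,0x0):hd(1)
probing: pciroot(0x0):pci(0x1f,0x02):sata(0x0,0x0,0x0):hd(2)
probing: pciroot(0x0):pci(0x1f,0x02):sata(0x0,0x0,0x0):hd(3)
probing: pciroot(0x0):pci(0x1f,0x02):sata(0x0,0x0,0x0):hd(4)
probing: pciroot(0x0):pci(0x1f,0x02):sata(0x0,0x0,0x0):hd(5)
"""


def parse(log: str):
    """Return (imagepath, list of probed device paths) from the log text."""
    imagepath, probed = None, []
    for line in log.splitlines():
        if line.startswith("boot1 imagepath: "):
            imagepath = line[len("boot1 imagepath: "):]
        elif line.startswith("probing: "):
            probed.append(line[len("probing: "):])
    return imagepath, probed


imagepath, probed = parse(LOG)
print("image path was probed:", imagepath in probed)
# The same partition on the other SATA port does show up in the probe list.
print("sata(0x0,...) variant probed:",
      imagepath.replace("sata(0x1", "sata(0x0") in probed)
```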


This is reinforced by the fact that test 2 booted from ada0 with "boot1 
imagepath: pciroot(0x0):pci(0x1f,0x02):sata(0x0,0x0,0x0):hd(1)".


The missing "supported (preferred)", even when it looks like there 
should be a match, is something I need to check.


Regards
Steve


On 01/02/2016 14:36, Tomoaki AOKI wrote:

Thanks in advance.
But unfortunately, the boot behavior of Diff7 and Diff8 has changed
from Diff6, back to the old problematic behavior.

FYI, I re-tested Diff6 (previously built binary) and reproduced the
behavior I already reported. (So no new logs for it.)

Please see attached 2 files (for Diff7 and Diff8 respectively).
In addition to test 2) and 3), 5) [USB boot] for Diff7 and 1) [single
drive] for Diff8 are done.

Regards.


On Sun, 31 Jan 2016 13:58:23 +0000
Steven Hartland <kill...@multiplay.co.uk> wrote:


Thanks for doing that it was very helpful, and I know transcribing from
video would have been quite a time consuming task.

I noticed a few interesting facts:
1. It looks like when you boot from ada0 and ada1 it's still picking the
same device (according to device order).
It's not 100% clear as your devices are SATA, which Diff 6 didn't have
decoding for. I've added that now so hopefully we can confirm; I also
added output of boot1 imgpath: so we can see what the EFI thinks the
boot device is.
2. Your USB device path has two message path entries, which means
msg_paths_match would result in a false positive for USB devices; this
is now fixed by matching until we see a media path.

If you can now re-test the following two cases:
2) Boot from ada0 without USB memstick attached (2 drives).
3) Boot from ada1 without USB memstick attached (2 drives).

I'm interested to confirm two things:
1. If the boot1 imgpath lines do indeed vary
2. If the "load: '/boot/loader.efi'" line devpath matches the boot1 imgpath.

The changes from Diff 7 shouldn't affect the outcome of the other tests;
confirming that too wouldn't hurt, but there's no need for the output.

  Regards
  Steve
On 31/01/2016 11:43, Tomoaki AOKI wrote:

I found Diff6 you uploaded to PHABRICATOR. So my report below is based
on it.

The patched boot1.efi runs as expected (== as you wrote) for me.
*boot1.efi without -DEFI_DEBUG isn't tested. Needed?

As I mentioned in my previous post, I have no serial console
environment, so I took a video of a limited set of tests and transcribed it.

The tests that were transcribed:

1) Boot from ada0 removing ada1, no USB memstick attached, ZFS in
   ada0 has loader.efi. (Single drive config.)

2) Boot from ada0, no USB memstick attached, ZFS in ada0 has
   loader.efi.

3) Boot from ada1, no USB memstick attached, ZFS in ada1 has
   loader.efi.

4) Boot from ada1, no USB memstick attached, only UFS in ada0 has
   loader.efi.

5) Boot from da0 (USB memstick with memstick.img of head).
   UFS in da0 has loader.efi and ZFS isn't present in da0.
   (3 drives [2 drives + USB memstick] config.)

Please see attached text for detail. Are more outputs needed?

Thanks in advance! I'm looking forward to see this MFC'ed before
releng/10.3 is branched. (Relies on imp@'s test?)

   *Will need to be in conjunction with changes after r294265 in head, as
currently Diff6 doesn't apply to stable/10.

Regards.


On Sat, 30 Jan 2016 19:12:34 +0000
Steven Hartland <kill...@multiplay.co.uk> wrote:


I believe, based on testing, that from Diff 5 onwards of
https://reviews.freebsd.org/D5108 this should work as you expect, i.e.

If boot1 is loaded from a device which has either a UFS or ZFS bootable
install then this is the device that will be used to boot.

If said device has both then the ZFS setup will still be tried first.

If you can test in your setup and confirm either way that would be most
appreciated.

   Regards
   Steve

On 30/01/2016 06:57, Tomoaki AOKI wrote:

Thanks for your quick supp

Re: ZFSROOT UEFI boot

2016-02-01 Thread Steven Hartland
Ok thanks, I hoped as much, otherwise we were looking at a very broken EFI 
firmware ;-)


I've found and fixed the match issue, so if you could re-test 2 and 3 we 
should be able to confirm all is good.


If all is good, a confirmation that there are no issues with the rest would 
be appreciated, but there's no need for the output.


Regards
Steve

On 01/02/2016 16:19, Tomoaki AOKI wrote:

Whoops! Found a typo in the Diff8 report. Sorry. :-(
Attached is the fixed one. [imagepath line of 1) is fixed.]


On Mon, 1 Feb 2016 23:36:27 +0900
Tomoaki AOKI <junch...@dec.sakura.ne.jp> wrote:


Thanks in advance.
But unfortunately, the boot behavior of Diff7 and Diff8 has changed
from Diff6, back to the old problematic behavior.

FYI, I re-tested Diff6 (previously built binary) and reproduced the
behavior I already reported. (So no new logs for it.)

Please see attached 2 files (for Diff7 and Diff8 respectively).
In addition to test 2) and 3), 5) [USB boot] for Diff7 and 1) [single
drive] for Diff8 are done.

Regards.


On Sun, 31 Jan 2016 13:58:23 +0000
Steven Hartland <kill...@multiplay.co.uk> wrote:


Thanks for doing that it was very helpful, and I know transcribing from
video would have been quite a time consuming task.

I noticed a few interesting facts:
1. It looks like when you boot from ada0 and ada1 it's still picking the
same device (according to device order).
It's not 100% clear as your devices are SATA, which Diff 6 didn't have
decoding for. I've added that now so hopefully we can confirm; I also
added output of boot1 imgpath: so we can see what the EFI thinks the
boot device is.
2. Your USB device path has two message path entries, which means
msg_paths_match would result in a false positive for USB devices; this
is now fixed by matching until we see a media path.

If you can now re-test the following two cases:
2) Boot from ada0 without USB memstick attached (2 drives).
3) Boot from ada1 without USB memstick attached (2 drives).

I'm interested to confirm two things:
1. If the boot1 imgpath lines do indeed vary
2. If the "load: '/boot/loader.efi'" line devpath matches the boot1 imgpath.

The changes from Diff 7 shouldn't affect the outcome of the other tests;
confirming that too wouldn't hurt, but there's no need for the output.

  Regards
  Steve
On 31/01/2016 11:43, Tomoaki AOKI wrote:

I found Diff6 you uploaded to PHABRICATOR. So my report below is based
on it.

The patched boot1.efi runs as expected (== as you wrote) for me.
*boot1.efi without -DEFI_DEBUG isn't tested. Needed?

As I mentioned in my previous post, I have no serial console
environment, so I took a video of a limited set of tests and transcribed it.

The tests that were transcribed:

1) Boot from ada0 removing ada1, no USB memstick attached, ZFS in
   ada0 has loader.efi. (Single drive config.)

2) Boot from ada0, no USB memstick attached, ZFS in ada0 has
   loader.efi.

3) Boot from ada1, no USB memstick attached, ZFS in ada1 has
   loader.efi.

4) Boot from ada1, no USB memstick attached, only UFS in ada0 has
   loader.efi.

5) Boot from da0 (USB memstick with memstick.img of head).
   UFS in da0 has loader.efi and ZFS isn't present in da0.
   (3 drives [2 drives + USB memstick] config.)

Please see attached text for detail. Are more outputs needed?

Thanks in advance! I'm looking forward to see this MFC'ed before
releng/10.3 is branched. (Relies on imp@'s test?)

   *Will need to be in conjunction with changes after r294265 in head, as
currently Diff6 doesn't apply to stable/10.

Regards.


On Sat, 30 Jan 2016 19:12:34 +0000
Steven Hartland <kill...@multiplay.co.uk> wrote:


I believe, based on testing, that from Diff 5 onwards of
https://reviews.freebsd.org/D5108 this should work as you expect, i.e.

If boot1 is loaded from a device which has either a UFS or ZFS bootable
install then this is the device that will be used to boot.

If said device has both then the ZFS setup will still be tried first.

If you can test in your setup and confirm either way that would be most
appreciated.

   Regards
   Steve

On 30/01/2016 06:57, Tomoaki AOKI wrote:

Thanks for your quick support!
I tried your patch [Diff1] (built with head r295032 world/kernel) and
now have good and bad news.

Good news is that without USB memstick boot1.efi runs as expected.
Great!

Bad news is that when booting from USB memstick (the one I used in my
previous test, with boot1.efi [bootx64.efi] and loader.efi replaced) and
either internal disk (ada[01]) has loader.efi in its ZFS pool,
ada[01] is booted instead of da0 (USB memstick).

 *If ada0 has loader.efi, always booted from ada0 (stable/10).
 *If ada0 doesn't have loader.efi and ada1 has, booted from ada1
  (head).
 *If both ada0 and ada1 don't have loader.efi, da0 (USB memstick) is
  booted (head, installer is invoked).

*Whichever ada[01] has loader.efi in their UFS or not didn't matter.

This behaviour would be because ZFS throughout all disk

Re: ZFSROOT UEFI boot

2016-01-31 Thread Steven Hartland
Thanks for doing that, it was very helpful, and I know transcribing from 
video would have been quite a time-consuming task.


I noticed a few interesting facts:
1. It looks like when you boot from ada0 and ada1 it's still picking the 
same device (according to device order).
It's not 100% clear as your devices are SATA, which Diff 6 didn't have 
decoding for. I've added that now so hopefully we can confirm; I also 
added output of boot1 imgpath: so we can see what the EFI thinks the 
boot device is.
2. Your USB device path has two message path entries, which means 
msg_paths_match would result in a false positive for USB devices; this 
is now fixed by matching until we see a media path.
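
To illustrate point 2, here is a rough model of the idea, not the actual code in D5108. A device path is represented as a list of (kind, text) nodes; the node kinds, the example paths and both helper functions are assumptions made up for this sketch:

```python
from typing import List, Optional, Tuple

Node = Tuple[str, str]  # (kind, text); kinds "hw", "msg", "media" are illustrative


def prefix_before_media(path: List[Node]) -> List[Node]:
    """Return the nodes of path that come before the first media node."""
    prefix = []
    for node in path:
        if node[0] == "media":
            break
        prefix.append(node)
    return prefix


def msg_paths_match(a: List[Node], b: List[Node]) -> bool:
    """Match two paths on everything up to the media node (sketch only)."""
    pa, pb = prefix_before_media(a), prefix_before_media(b)
    return pa == pb and any(kind == "msg" for kind, _ in pa)


def first_msg_node(path: List[Node]) -> Optional[Node]:
    """Old-style comparison key: only the first messaging node."""
    return next((n for n in path if n[0] == "msg"), None)


# Hypothetical USB paths with two messaging nodes each.
usb_boot = [("hw", "pciroot(0x0)"), ("hw", "pci(0x14,0x00)"),
            ("msg", "usb(0x1,0x0)"), ("msg", "usb(0x2,0x0)"), ("media", "hd(1)")]
usb_other = [("hw", "pciroot(0x0)"), ("hw", "pci(0x14,0x00)"),
             ("msg", "usb(0x1,0x0)"), ("msg", "usb(0x3,0x0)"), ("media", "hd(1)")]
same_disk = usb_boot[:4] + [("media", "hd(2)")]

print(first_msg_node(usb_boot) == first_msg_node(usb_other))  # first node only
print(msg_paths_match(usb_boot, usb_other))                   # full prefix
print(msg_paths_match(usb_boot, same_disk))                   # other partition, same disk
```

Comparing only the first messaging node treats the two different USB devices as the same one, whereas comparing everything up to the media node tells them apart and still matches another partition on the boot disk.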


If you can now re-test the following two cases:
2) Boot from ada0 without USB memstick attached (2 drives).
3) Boot from ada1 without USB memstick attached (2 drives).

I'm interested to confirm two things:
1. If the boot1 imgpath lines do indeed vary
2. If the "load: '/boot/loader.efi'" line devpath matches the boot1 imgpath.

The changes from Diff 7 shouldn't affect the outcome of the other tests; 
confirming that too wouldn't hurt, but there's no need for the output.


Regards
Steve
On 31/01/2016 11:43, Tomoaki AOKI wrote:

I found Diff6 you uploaded to PHABRICATOR. So my report below is based
on it.

The patched boot1.efi runs as expected (== as you wrote) for me.
   *boot1.efi without -DEFI_DEBUG isn't tested. Needed?

As I mentioned in my previous post, I have no serial console
environment, so I took a video of a limited set of tests and transcribed it.

The tests that were transcribed:

   1) Boot from ada0 removing ada1, no USB memstick attached, ZFS in
  ada0 has loader.efi. (Single drive config.)

   2) Boot from ada0, no USB memstick attached, ZFS in ada0 has
  loader.efi.

   3) Boot from ada1, no USB memstick attached, ZFS in ada1 has
  loader.efi.

   4) Boot from ada1, no USB memstick attached, only UFS in ada0 has
  loader.efi.

   5) Boot from da0 (USB memstick with memstick.img of head).
  UFS in da0 has loader.efi and ZFS isn't present in da0.
  (3 drives [2 drives + USB memstick] config.)

Please see attached text for detail. Are more outputs needed?

Thanks in advance! I'm looking forward to see this MFC'ed before
releng/10.3 is branched. (Relies on imp@'s test?)

  *Will need to be in conjunction with changes after r294265 in head, as
   currently Diff6 doesn't apply to stable/10.

Regards.


On Sat, 30 Jan 2016 19:12:34 +0000
Steven Hartland <kill...@multiplay.co.uk> wrote:


I believe, based on testing, that from Diff 5 onwards of
https://reviews.freebsd.org/D5108 this should work as you expect, i.e.

If boot1 is loaded from a device which has either a UFS or ZFS bootable
install then this is the device that will be used to boot.

If said device has both then the ZFS setup will still be tried first.

If you can test in your setup and confirm either way that would be most
appreciated.

  Regards
  Steve

On 30/01/2016 06:57, Tomoaki AOKI wrote:

Thanks for your quick support!
I tried your patch [Diff1] (built with head r295032 world/kernel) and
now have good and bad news.

Good news is that without USB memstick boot1.efi runs as expected.
Great!

Bad news is that when booting from USB memstick (the one I used in my
previous test, with boot1.efi [bootx64.efi] and loader.efi replaced) and
either internal disk (ada[01]) has loader.efi in its ZFS pool,
ada[01] is booted instead of da0 (USB memstick).

*If ada0 has loader.efi, always booted from ada0 (stable/10).
*If ada0 doesn't have loader.efi and ada1 has, booted from ada1
 (head).
*If both ada0 and ada1 don't have loader.efi, da0 (USB memstick) is
 booted (head, installer is invoked).

   *Whichever ada[01] has loader.efi in their UFS or not didn't matter.

This behaviour would be because ZFS is tried throughout all disks
before UFS is tried throughout all disks, if I understood correctly.

Changing boot order (ZFS to UFS per each disk, instead of each
ZFS to each UFS) would help.
But providing ZFS-disabled boot1.efi (boot1ufs.efi?) for installation
media (memstick, dvd, ...) helps, too. I built ZFS-disabled boot1.efi
and it worked fine for USB memstick for me.

   *`make clean && make -DMK_ZFS=no` in sys/boot/efi/boot1 didn't disable the
 ZFS module, so I had to edit the definition of *boot_modules[] in
 boot1.c. I may have been missing something.

Regards.


On Fri, 29 Jan 2016 02:58:26 +0000
Steven Hartland <kill...@multiplay.co.uk> wrote:


On 28/01/2016 16:22, Doug Rabson wrote:

On 28 January 2016 at 15:03, Tomoaki AOKI <junch...@dec.sakura.ne.jp> wrote:


It's exactly the NO GOOD point. The disk where boot1 is read from
should be where loader.efi and loader.conf are first read.


I just wanted to note that gptzfsboot and zfsboot behaves this way. Boot1
looks for loader in the pool which contains the disk that the BIOS booted.
It passes through the ID of that

Re: ZFSROOT UEFI boot

2016-01-30 Thread Steven Hartland
I believe, based on testing, that from Diff 5 onwards of 
https://reviews.freebsd.org/D5108 this should work as you expect, i.e.


If boot1 is loaded from a device which has either a UFS or ZFS bootable 
install then this is the device that will be used to boot.


If said device has both then the ZFS setup will still be tried first.
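
Roughly, the selection described above can be modelled like this. It is a sketch under stated assumptions, not the boot1 code: the function name, the data layout and the fallback to the remaining devices when the boot device has nothing bootable are all invented for illustration:

```python
from typing import Dict, Optional, Set, Tuple


def pick_boot_source(devices: Dict[str, Set[str]],
                     boot_device: str) -> Optional[Tuple[str, str]]:
    """Pick (device, filesystem) to load loader.efi from.

    devices maps a device name to the bootable filesystems found on it.
    The device boot1 was loaded from is tried first; on each device ZFS is
    tried before UFS. Falling back to other devices is an assumption here.
    """
    order = [boot_device] + [d for d in devices if d != boot_device]
    for dev in order:
        for fs in ("zfs", "ufs"):
            if fs in devices.get(dev, set()):
                return dev, fs
    return None


# Layout similar to the memstick test: ZFS on both disks, UFS on the stick.
layout = {"ada0": {"zfs"}, "ada1": {"zfs"}, "da0": {"ufs"}}
print(pick_boot_source(layout, "da0"))   # boot device wins even though it is UFS
print(pick_boot_source(layout, "ada1"))  # second disk is used when booted from it
```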

If you can test in your setup and confirm either way that would be most 
appreciated.


Regards
Steve

On 30/01/2016 06:57, Tomoaki AOKI wrote:

Thanks for your quick support!
I tried your patch [Diff1] (built with head r295032 world/kernel) and
now have good and bad news.

Good news is that without USB memstick boot1.efi runs as expected.
Great!

Bad news is that when booting from USB memstick (the one I used in my
previous test, with boot1.efi [bootx64.efi] and loader.efi replaced) and
either internal disk (ada[01]) has loader.efi in its ZFS pool,
ada[01] is booted instead of da0 (USB memstick).

   *If ada0 has loader.efi, always booted from ada0 (stable/10).
   *If ada0 doesn't have loader.efi and ada1 has, booted from ada1
(head).
   *If both ada0 and ada1 don't have loader.efi, da0 (USB memstick) is
booted (head, installer is invoked).

  *Whichever ada[01] has loader.efi in their UFS or not didn't matter.

This behaviour would be because ZFS is tried throughout all disks
before UFS is tried throughout all disks, if I understood correctly.

Changing boot order (ZFS to UFS per each disk, instead of each
ZFS to each UFS) would help.
But providing ZFS-disabled boot1.efi (boot1ufs.efi?) for installation
media (memstick, dvd, ...) helps, too. I built ZFS-disabled boot1.efi
and it worked fine for USB memstick for me.

  *`make clean && make -DMK_ZFS=no` in sys/boot/efi/boot1 didn't disable the
ZFS module, so I had to edit the definition of *boot_modules[] in
boot1.c. I may have been missing something.

Regards.


On Fri, 29 Jan 2016 02:58:26 +0000
Steven Hartland <kill...@multiplay.co.uk> wrote:


On 28/01/2016 16:22, Doug Rabson wrote:

On 28 January 2016 at 15:03, Tomoaki AOKI <junch...@dec.sakura.ne.jp> wrote:


It's exactly the NO GOOD point. The disk where boot1 is read from
should be where loader.efi and loader.conf are first read.


I just wanted to note that gptzfsboot and zfsboot behaves this way. Boot1
looks for loader in the pool which contains the disk that the BIOS booted.
It passes through the ID of that pool to loader which uses that pool as the
default for loading kernel and modules. I believe this is the correct
behaviour. For gptzfsboot and zfsboot, it is possible to override by
pressing space at the point where it is about to load loader.

I believe I understand at least some of your issue now, could you please
test the code on the following review to see if it fixes your issue please:
https://reviews.freebsd.org/D5108

  Regards
  Steve







Re: ZFSROOT UEFI boot

2016-01-30 Thread Steven Hartland
I did some more work on the review last night; could you apply the
latest patch set, Diff4, to see if that helps?

If not, compile with debugging by adding -DEFI_DEBUG to your make line; you
will then get a lot more information about which disk is being used to load from
as well as info about the probe order.

What you should see is that the disk you boot from (where boot1 is loaded
from) is probed first and hence gets flagged as successful
(preferred).

This also shows up as * instead of + in the non-debug boot process.

If this happens you should see loader.efi loaded from this disk and then
the kernel.

The debug output is verbose so you may need a serial console to be able to
capture the output easily.

Thanks for testing so far, hopefully we can nail this soon.
On Saturday, 30 January 2016, Tomoaki AOKI <junch...@dec.sakura.ne.jp>
wrote:

> Thanks for your quick support!
> I tried your patch [Diff1] (built with head r295032 world/kernel) and
> now have good and bad news.
>
> Good news is that without USB memstick boot1.efi runs as expected.
> Great!
>
> Bad news is that when booting from USB memstick (the one I used in my
> previous test, with boot1.efi [bootx64.efi] and loader.efi replaced) and
> either internal disk (ada[01]) has loader.efi in its ZFS pool,
> ada[01] is booted instead of da0 (USB memstick).
>
>   *If ada0 has loader.efi, always booted from ada0 (stable/10).
>   *If ada0 doesn't have loader.efi and ada1 has, booted from ada1
>(head).
>   *If both ada0 and ada1 don't have loader.efi, da0 (USB memstick) is
>booted (head, installer is invoked).
>
>  *Whichever ada[01] has loader.efi in their UFS or not didn't matter.
>
> These behaviour would be because ZFS thoughout all disks is tried
> before trying UFS throughout all disks, if I understood correctly.
>
> Changing boot order (ZFS to UFS per each disk, instead of each
> ZFS to each UFS) would help.
> But providing ZFS-disabled boot1.efi (boot1ufs.efi?) for installation
> media (memstick, dvd, ...) helps, too. I built ZFS-disabled boot1.efi
> and it worked fine for USB memstick for me.
>
>  *`make clean && make -DMK_ZFS=no` in sys/boot/efi/boot1 didn't disabled
>ZFS module, so I must edit the definition of *boot_modules[] in
>boot1.c. I'd have been missing something.
>
> Regards.
>
> --
> 青木 知明  [Tomoaki AOKI]
> junch...@dec.sakura.ne.jp
> mxe02...@nifty.com

Re: ZFSROOT UEFI boot

2016-01-30 Thread Steven Hartland
I just realised an important point: does your USB disk have a UFS
root partition and your internal disk a ZFS root partition?

If so then I know what the issue is. I'll have a quick look now, so wait for
diff5 to appear before testing.

Re: ZFSROOT UEFI boot

2016-01-28 Thread Steven Hartland

On 28/01/2016 16:22, Doug Rabson wrote:
> On 28 January 2016 at 15:03, Tomoaki AOKI wrote:
>
>> It's exactly the NO GOOD point. The disk where boot1 is read from
>> should be where loader.efi and loader.conf are first read.
>
> I just wanted to note that gptzfsboot and zfsboot behave this way. Boot1
> looks for loader in the pool which contains the disk that the BIOS booted.
> It passes through the ID of that pool to loader which uses that pool as the
> default for loading kernel and modules. I believe this is the correct
> behaviour. For gptzfsboot and zfsboot, it is possible to override by
> pressing space at the point where it is about to load loader.

I believe I understand at least some of your issue now. Could you please
test the code on the following review to see if it fixes your issue:

https://reviews.freebsd.org/D5108

Regards
Steve


Re: ZFSROOT UEFI boot

2016-01-24 Thread Steven Hartland

On 24/01/2016 12:53, Tomoaki AOKI wrote:

Unfortunately, this (and its committed successor and original for UFS)
fails to boot in some situations, like below. OTOH, gptzfsboot (and
maybe gptboot for UFS, too) is OK.

When I select Disk1 from UEFI firmware, bootx64.efi in Disk1 EFI
partition is used and it searches /boot/loader.efi from Disk2 (in ZFS,
if none, in UFS) only.
And when I select Disk2, bootx64.efi in Disk2 EFI partition is used and
it searches /boot/loader.efi from Disk1 only.


In fact, this is a long-standing and still-living problem.
In the past, a USB memstick with head memstick.img (UEFI enabled, but
without root-on-ZFS support) booted fine, but after I added a UFS2
partition on the internal disk, the USB memstick didn't boot anymore.
It searches for /boot/loader.efi on the internal UFS and fails, as it was
blank (only newfs'ed) at that time. Another USB memstick with stable/10
memstick.img is still fine, as it's still ancient BIOS based.

Possibly, it's not a fault of boot1.efi but caused by a difference in
the implementation of the UEFI firmware. If that's it, a different boot1.efi
would be needed for each implementation.

A bit more details of tests are as below. Not all combinations are
covered, but would be sufficient to determine above conclusion.


Common configurations for all tests:
   *Each disk has one EFI partition (p1), one freebsd-boot partition
(p2), one swap partition (p3), one UFS partition (p4), and one
ZFS pool (p5) with this order.

   *Each partition has different GEOM label.

   *In each disk, FreeBSD is installed as root on ZFS. No other OS.

   *stable/10 (r294614) is installed in Disk1.

   *head (r294567) is installed in Disk2.

   *ZFS-enabled boot1.efi (head r294567) is used as bootx64.efi.


Set 1: Boot from Disk1 (select it in UEFI firmware).
In all tests, /boot/loader.efi in Disk1 (both UFS and ZFS)
are NOT searched at all.

Could you clarify what you mean by this?

When performing the scan, boot1 uses the following coding:
* "+" = partition probe success (potential boot partition)
* "." = partition probe unsupported (valid partition not detected)
* "x" = partition probe error (unexpected error)
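
The coding above can be written down as a small lookup. This is only a sketch: the enum and function names are hypothetical and are not taken from the boot1 source.

```c
/* Hypothetical probe outcomes; the names are not taken from boot1. */
enum probe_result {
    PROBE_OK,           /* potential boot partition */
    PROBE_UNSUPPORTED,  /* no valid partition detected */
    PROBE_ERROR         /* unexpected error */
};

/* Map a probe outcome to the progress character listed above. */
char
probe_char(enum probe_result r)
{
    switch (r) {
    case PROBE_OK:
        return '+';
    case PROBE_UNSUPPORTED:
        return '.';
    case PROBE_ERROR:
        return 'x';
    }
    return '?'; /* not reached for valid enum values */
}
```

Each probed partition then contributes one character to the progress line, so a run of two '+' and one '.' would mean two candidate partitions and one that no module recognised.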

  1-1) Both UFS and ZFS has no /boot/loader.efi
 -> Fail to boot. Fall back to boot1 prompt.

This is expected

  1-2) Disk2 UFS only has /boot/loader.efi, whole /boot of Disk2 ZFS
   is copied to UFS.
 -> head in Disk2 boots fine.

What do you mean by "whole /boot of Disk2 ZFS is copied to UFS"?

  1-3) Same as 1-2, except its /boot/loader.efi is overwritten by the
   one of stable/10.
 -> head in Disk2 boots fine, as loader.efi loads kernel from
/boot/kernel/kernel in UFS and kernel with zfs.ko can mount
root on ZFS specified by vfs.root.mountfrom.

  1-4) Disk2 UFS only has /boot/loader.efi, whole /boot of Disk1 ZFS
   is copied to UFS and its /boot/loader.efi is overwritten by
   the one of head.
 -> stable/10 in Disk1 ZFS boots fine.

  1-5) Disk2 ZFS only has /boot/loader.efi.
 -> head in Disk2 ZFS boots fine.

  1-6) Both UFS and ZFS in Disk2 has /boot/loader.efi.
   (Mix of 1-4 and 1-5)
 -> head in Disk2 ZFS boots fine.


Set 2: Boot from Disk2 (select it in UEFI firmware).
In all tests, /boot/loader.efi in Disk2 (both UFS and ZFS)
are NOT searched at all.

  2-1) Both UFS and ZFS has no /boot/loader.efi
 -> Fail to boot. Fall back to boot1 prompt.
ZFS pool in Disk2 is shown before one in Disk1.

  2-2) Disk1 UFS only has /boot/loader.efi, whole /boot of Disk2 ZFS
   is copied to UFS.
 -> head in Disk2 ZFS boots fine.

  2-3) Disk1 UFS only has /boot/loader.efi, whole /boot of Disk1 ZFS
   is copied to UFS.
 -> stable/10 in Disk1 ZFS boots fine, as loader.efi loads
kernel from /boot/kernel/kernel in UFS and kernel with zfs.ko
can mount root on ZFS specified by vfs.root.mountfrom.

  2-4) Disk1 UFS only has /boot/loader.efi, whole /boot of Disk1 ZFS
   is copied to UFS and its /boot/loader.efi is overwritten by
   the one of head.
 -> stable/10 in Disk1 ZFS boots fine.

  2-5) Disk1 ZFS only has /boot/loader.efi of stable/10 itself.
 -> Fail to boot. Fall back to boot1 prompt.
ZFS pool in Disk2 is shown before one in Disk1.

  2-6) Disk1 ZFS only has /boot/loader.efi of head.
 -> stable/10 in Disk1 ZFS boots fine.

  2-7) Both UFS and ZFS in Disk1 has /boot/loader.efi of head.
   (Mix of 2-2 and 2-6)
 -> stable/10 in Disk1 ZFS boots fine.

  2-8) UFS has /boot/loader.efi of head (head kernel copied), but ZFS
   has /boot/loader.efi of stable/10 itself. (Mix of 2-2 and 2-5)
 -> Same as 2-5. Fail to boot. Fall back to boot1 prompt.
ZFS pool in Disk2 is shown before one in Disk1.

Set 3: Disk2 is removed. (Disk1 only environment)

   3-1) ZFS only has /boot/loader.efi of head.
 -> stable/10 in Disk1 ZFS boots 

Re: r294248: boot stuck: EFI loader doesn't proceed

2016-01-18 Thread Steven Hartland

On 18/01/2016 21:57, Ed Maste wrote:
> On 18 January 2016 at 15:26, Andrew Turner wrote:
>
>> the issue was we were caching reads from the first filesystem we looked
>> at. I've committed the fix in r294291 to force the code to re-read on
>> each filesystem it looks at.
>
> Thanks Andrew - my QEMU boot failure is not reproducible on a build
> from a clean tree with this change included.

I still think there are issues in there; specifically, the lookup call in
try_load, which calls fsread, should be getting bad results.


I believe the code here https://reviews.freebsd.org/D4989 fixes that as 
well as eliminating fsstat and all the duplicated code.
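
To make the caching problem concrete, here is a minimal C sketch of one way such a bug can arise and be avoided. It is an assumption-laden illustration, not the code from r294291 or D4989: the struct and function names are hypothetical, and the cache holds a single block only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-block read cache keyed by device and block number. */
struct blk_cache {
    const void *dev;      /* device the cached block came from */
    unsigned long blkno;  /* block number of the cached data */
    bool valid;
};

/*
 * Return true if the cache can satisfy a read of blkno on dev.
 * Comparing the block number alone would hand the first filesystem's
 * data to every later device, so the device is part of the key.
 */
bool
cache_hit(const struct blk_cache *c, const void *dev, unsigned long blkno)
{
    return c->valid && c->dev == dev && c->blkno == blkno;
}

/* Record that the cache now holds blkno from dev. */
void
cache_fill(struct blk_cache *c, const void *dev, unsigned long blkno)
{
    c->dev = dev;
    c->blkno = blkno;
    c->valid = true;
}

/* Drop the cached block, e.g. before probing the next filesystem. */
void
cache_invalidate(struct blk_cache *c)
{
    c->valid = false;
}
```

Either keying the cache on the device or invalidating it before each new filesystem forces a re-read; doing both is cheap.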


Regards
Steve


Re: r294248: boot stuck: EFI loader doesn't proceed

2016-01-18 Thread Steven Hartland



On 18/01/2016 19:08, Ed Maste wrote:
> On 18 January 2016 at 03:10, Dimitry Andric wrote:
>
>> On 18 Jan 2016, at 07:20, O. Hartmann wrote:
>>
>>> Building NanoBSD images booting off from USB Flash drives and having two GPT
>>> partitions, booting is stuck in the UEFI loader, presenting me with something
>>> like:
>>>
>>> [...]
>>> Probing 6 block devices.++. done
>>>
>>>   ZFS found no pools
>>>   UFS found 2 partitions
>>>
>>> And further nothing happens. A RESET is only possible by a hardreset - it seems
>>> the system is crashed/stuck/frozen or something similar.
>>>
>>> The last images working run r293654. The issue occurs with r294248.
>>>
>>> Any suggestions possible? Did I miss something?
>>
>> Looks to me like fallout from the recent modularisation in r294060,
>> and/or ZFS support in r294068.  Steven, any clue?
>
> In QEMU boot1 failed for me with "Failed start image provided by UFS",
> and I can confirm that it's fixed by reverting those two commits.

I believe this is an issue with the UFS caching code introduced by the UFS
modularisation.

Andrew fixed some of it, but not all, in r294291; I believe the lookup
results in try_load would still have been invalid.


https://reviews.freebsd.org/D4989 should fix the rest, as well as 
resulting in much simpler flow IMO.


If you could try this that would be great.

Regards
Steve


Re: FreeBsd MCA Panic Crash !!

2016-01-04 Thread Steven Hartland
Bank 5 seems to be common to all the crashes, which may suggest you have
some dodgy RAM, or possibly a problem with the memory controller of the
CPU driving it.


As the error says this is a Hardware issue.
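
For anyone reading the mcelog dump quoted below, the flag lines it prints ("Uncorrected error", "Processor context corrupt" and so on) correspond to individual bits of the MCi_STATUS register. The following C sketch decodes them; the bit positions are the architectural ones from the Intel SDM, while the macro and function names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits of IA32_MCi_STATUS (Intel SDM, machine-check architecture). */
#define MC_STATUS_VAL   (UINT64_C(1) << 63) /* register contents valid */
#define MC_STATUS_OVER  (UINT64_C(1) << 62) /* an earlier error was overwritten */
#define MC_STATUS_UC    (UINT64_C(1) << 61) /* uncorrected error */
#define MC_STATUS_EN    (UINT64_C(1) << 60) /* error reporting enabled */
#define MC_STATUS_MISCV (UINT64_C(1) << 59) /* MCi_MISC register valid */
#define MC_STATUS_ADDRV (UINT64_C(1) << 58) /* MCi_ADDR register valid */
#define MC_STATUS_PCC   (UINT64_C(1) << 57) /* processor context corrupt */

/* True if the given flag bit is set in a raw 64-bit status value. */
bool
mc_flag_set(uint64_t status, uint64_t flag)
{
    return (status & flag) != 0;
}

/* The low 16 bits of the status value hold the MCA error code. */
uint16_t
mc_error_code(uint64_t status)
{
    return (uint16_t)(status & 0xffff);
}
```

Setting VAL, UC, EN, MISCV, ADDRV and PCC together gives 0xbe in the top byte, which is consistent with the leading "be" of the STATUS field in the dump and with the flag lines mcelog prints for it.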

One thing we've used in the past to narrow issues like this down is to
remove as much RAM as possible and to disable all but one CPU core using
/boot/loader.conf hints, where X is the number of each CPU core to
disable, as reported by the boot process:

hint.lapic.X.disabled="1"
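
If there are many cores to disable, the hint lines can be generated rather than typed. A minimal C sketch follows; the function name is hypothetical, and which values of X to pass is an assumption you have to fill in from your own boot output.

```c
#include <stdio.h>

/*
 * Format one loader.conf line that disables the CPU numbered x, in the
 * form shown above. Returns the snprintf() result, i.e. the length the
 * full line needs (excluding the terminating NUL).
 */
int
lapic_disable_hint(char *buf, size_t len, unsigned int x)
{
    return snprintf(buf, len, "hint.lapic.%u.disabled=\"1\"", x);
}
```

Calling it once per core you want switched off and appending each result to /boot/loader.conf gives the same lines as writing them by hand.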

Regards
Steve

On 04/01/2016 10:34, shahzaibcb wrote:

Hi,

We've switched to FreeBSD recently to accommodate large video storage, as we
are running a video streaming website. The job of the FreeBSD server is to
transcode the uploaded videos using ffmpeg and serve them to users via the nginx
webserver, but so far our experience with it has not been very good. It crashes
every 2-3 days and we're unable to track down the problem. The server specs
are pretty high:


Supermicro X5690 (12 cores, 24 threads - 2u)
96GB RAM
12x3TB RAID-10 (HBA-LSI9211)

Here is the screenshot of recent crash :

http://prntscr.com/9er3pk

One thing worth mentioning is, before going down there's no load on server,
more or less free RAM usually is around 12GB.  We've tried following
solutions so far :


- Updated FreeBSD OS
- Replaced 800W PS with 900W
- We've reduced CMOS from MAX(26x) to 18x as suggested in this post
http://unix.stackexchange.com/questions/60574/determining-cause-of-linux-kernel-panic

The solution we've not performed so far is :

- Disable mca using (hw.mca.enabled: 0) - As we're getting MCA panics.

Here is the crash dump :

[root@cw001 /var/crash]# mcelog --no-dmi --ascii --file core.txt.1
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 3 BANK 5
MISC 0 ADDR 802bf6a69
MCG status:MCIP
MCi status:
Uncorrected error
Error enabled
MCi_MISC register valid
MCi_ADDR register valid
Processor context corrupt
MCA: Internal Timer error
STATUS be800400 MCGSTATUS 4
MCGCAP 1c09 APICID 3 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 2 BANK 5
MISC 0 ADDR 802bf6a69
MCG status:MCIP
MCi status:
Uncorrected error
Error enabled
MCi_MISC register valid
MCi_ADDR register valid
Processor context corrupt
MCA: Internal Timer error
STATUS be800400 MCGSTATUS 4
MCGCAP 1c09 APICID 2 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 3 BANK 5
MISC 0 ADDR 802bf6a69
MCG status:MCIP
MCi status:
Uncorrected error
Error enabled
MCi_MISC register valid
MCi_ADDR register valid
Processor context corrupt
MCA: Internal Timer error
STATUS be800400 MCGSTATUS 4
MCGCAP 1c09 APICID 3 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 2 BANK 5
MISC 0 ADDR 802bf6a69
MCG status:MCIP
MCi status:
Uncorrected error
Error enabled
MCi_MISC register valid
MCi_ADDR register valid
Processor context corrupt
MCA: Internal Timer error
STATUS be800400 MCGSTATUS 4
MCGCAP 1c09 APICID 2 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44

---

I showed those Hardware errors to Vendor from whom we purchased Supermicro
servers . This is what he has to say :

---
Why do you not made one test environment with CentOS or one other Linux that
you know to use, and see if you have same errors ??? if not than you know
that the errors come from OS not from hardware. ( CentOS, RedHead….work
diferend like FreeBSD – work direct on hardware if you don’t have the right
kernel settings can the server crashed. CentOS , RedHead…. don’t work direct
on hardware and distribute the resource load better and you have better
control and you can better debug one situation)
---

Now we're stuck and unable to determine whether the issue is with FreeBSD
or the hardware. We're thinking of disabling MCA in loader.conf, but people
advise against it. If you guys can help us, it'd be very kind.




Re: ZFSROOT UEFI boot

2015-12-10 Thread Steven Hartland
I've literally just got this working on 10.2 after working on the code
posted on the review, which you can find here:
https://reviews.freebsd.org/D4104

If you're happy running current then the patch file I linked in my comment
should apply cleanly and just work.

If you want 10.x then there's quite a bit more needed. As I said I do have
this working so can post patches when I'm back in the office.

Either way once applied a standard efi install just works. Essentially
create efi partition and use gpart to install the efi bootcode and away you
go.

I've just used this with a custom mfsbsd iso to perform a 10.2-RELEASE
ZFS boot install on some Intel NVMe disks set up as raidz2, which only
support efi boot.

On 10 Dec 2015, at 12:18, krad wrote:

> Hi, I need to get one of my machines converted over from bios GPT zfsroot
> boot to efi. I know you can boot freebsd under EFI with a ufs kernel but
> this isn't the route I want. There are patches under test for EFI zfs root.
> However, when I read the thread it was unclear which version of these
> patches was needed and where to get them. Does anyone know where they are,
> if there are any prebuilt zfsloader etc binaries, or if the patches have
> made it to head yet?
>
> Also does anyone have any pointers or good experience with grub efi and zfs
> on root? I'm considering this option as it would make booting into specific
> boot environments easier


Re: Increase BUFSIZ to 8192

2015-05-12 Thread Steven Hartland

4k block size on the underlying device?

On 12/05/2015 00:14, Adrian Chadd wrote:

> So I'm curious - why's it faster?
>
> -adrian


Re: Fix for r281680 -- broke i386 world

2015-04-18 Thread Steven Hartland



On 18/04/2015 17:30, David Wolfskill wrote:

On Sat, Apr 18, 2015 at 06:34:59PM +0300, Konstantin Belousov wrote:

...

-   printf("LE_STATUS: %d %d %lx\n", e, rp.status, rp.le_status);
+   printf("LE_STATUS: %d %d %jx\n", e, rp.status, rp.le_status);
  
  	return 0;

  }

The j modifier specifies that the type of the argument is (u)intmax_t.
It is only a coincidence that uint64_t is the max integer type; the arg should
be cast to uintmax_t.

Could you, please, update and test ?


Thank you for the correction; the attached patch survives both i386 and
amd64 make buildworld ... and comes a bit closer to the above
specification.  (I had tried (uintmax_t)rp.le_features at first; that
failed (at least on amd64), with:

--- usr.sbin.all__D ---
/usr/src/usr.sbin/bluetooth/hccontrol/le.c:236:15: error: expected ')'
 (uintmax_t)rp.le_features);
^
/usr/src/usr.sbin/bluetooth/hccontrol/le.c:235:8: note: to match this '('
 printf("LOCAL SUPPORTED: %d %d %ju\n", e, rp.status,
   ^
/usr/src/usr.sbin/bluetooth/hccontrol/le.c:253:60: error: expected ')'
 printf("LE_STATUS: %d %d %jx\n", e, rp.status, (uintmax_t)rp.le_status);
   ^
/usr/src/usr.sbin/bluetooth/hccontrol/le.c:253:8: note: to match this '('
 printf("LE_STATUS: %d %d %jx\n", e, rp.status, (uintmax_t)rp.le_status);
   ^
2 errors generated.

So I took a bit of evasive action.)

The error's not very good, but I'm guessing you're missing #include
<stdint.h> for uintmax_t, whereas u_int64_t is from <sys/types.h> iirc.
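
Putting the two points from this thread together (the cast to uintmax_t and the stdint.h include), a minimal standalone C sketch looks like this. The function name is hypothetical and the status field is assumed to be a 64-bit unsigned value; it is not the actual hccontrol code.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit status word with %jx. The cast to uintmax_t makes the
 * argument type match the conversion on both i386 and amd64, and
 * <stdint.h> is what declares uintmax_t.
 */
int
format_le_status(char *buf, size_t len, uint64_t le_status)
{
    return snprintf(buf, len, "LE_STATUS: %jx", (uintmax_t)le_status);
}
```

The same pattern applies to %ju for the decimal case.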


Regards
Steve


Re: T40 bootloop on CAM status: Command timeout on both 10.1 and -CURRENT

2015-03-31 Thread Steven Hartland
Verbose boot of the working OS version was what I was after, which
should provide details of what it's detecting ;-)

A camcontrol identify of the cdrom from the working version may
also be useful.

On 31/03/2015 09:54, Pietro Sammarco wrote:
> Currently the cdrom drive, and yes, it works fine till 10 with the
> legacy ATA stack. Verbose boot doesn't give out any errors or logs
> besides what's shown in the picture I attached to the first email.




Re: T40 bootloop on CAM status: Command timeout on both 10.1 and -CURRENT

2015-03-30 Thread Steven Hartland

Is there anything connected to the second channel, and if so, what?

Was this working in a previous version, and if so, which one, and what was
the verbose boot log from it?


On 30/03/2015 19:07, bsdml wrote:
> Hello Kevin,
>
> thanks for your clarification. Unfortunately I wasn't aware that the
> T40 and *4's line used a SATA-PATA converter, and especially that it was
> going to clash with the new ATA stack in FreeBSD. Both OpenBSD and
> NetBSD work out of the box without any hassle; however, I'd still
> prefer to use FreeBSD on it as I have been using FreeBSD for about 8
> years now and I am very comfortable with it.
>
> The question at this point is, is there any hope to see this issue
> resolved in the future? Or will I have to give up the second ATA
> channel in order to use FreeBSD?
>
> Regards,
> Pietro Sammarco
>
> On 30/03/2015 06:17, Kevin Oberman wrote:
>> On Sun, Mar 29, 2015 at 2:27 AM, Wolfgang Zenker
>> <wolfg...@lyxys.ka.sub.org> wrote:
>>
>>> Hi,
>>>
>>> * bsdml <pietro.bs...@gmail.com> [150329 01:34]:
>>>> since I tried to install FreeBSD 10.1 on my recently purchased T40 I got
>>>> stuck at this annoying bootloop that says
>>>> ATAPI_IDENTIFY. ACB: a1 00 00 00 00 40 00 00 00 etc etc.. CAM status:
>>>> Command timeout. I have also tried latest 11-CURRENT snapshot and it
>>>> did not make any difference at all, it is affected by the same exact
>>>> bootloop.
>>>> [..]
>>>> It seems like there might be an issue with the CAM ATA stack that is
>>>> clashing with the PATA controller on my T40.
>>>
>>> I had the same problem on an ancient T42p. In my case, disabling the
>>> second ata channel allowed me to boot.
>>>
>>> I added the following line to /boot/device.hints:
>>> hint.ata.1.disabled="1"
>>
>> This is an annoying side-effect of the brain-dead SATA-PATA converter
>> in that generation of ThinkPads. The Intel ICH6 chipset is SATA, but,
>> for reasons known to IBM/Lenovo, the systems used PATA drives! So
>> they had a SATA-PATA converter built in that screwed up a LOT of
>> things, mostly compromising performance and generating assorted log
>> entries. Looks like that also is broken in modern ATA support if a
>> drive is not present.
>>
>> This was always my biggest complaint with this laptop (T42) which I
>> used for several years until I retired and returned it so it could be
>> excessed legally as it was government property (and, I didn't really
>> want it, even if I could have kept it). Not an awful system, but this
>> one issue was really annoying to me.
>>
>> --
>> Kevin Oberman, Network Engineer, Retired
>> E-mail: rkober...@gmail.com




Re: who broke dtrace and buildworld?

2015-01-17 Thread Steven Hartland


On 18/01/2015 00:22, Steve Kargl wrote:

On Sat, Jan 17, 2015 at 03:54:47PM -0800, Steve Kargl wrote:

% cd /usr/src
% svn update
Updating '.':
At revision 277307.
% make buildworld


=== cddl/usr.sbin/dtrace (all)
cc  -O2 -pipe -march=core2  -I/usr/src/cddl/usr.sbin/dtrace/../../../sys/cddl/compat/opensolaris  -I/usr/src/cddl/usr.sbin/dtrace/../../../cddl/compat/opensolaris/include  -I/usr/src/cddl/usr.sbin/dtrace/../../../cddl/contrib/opensolaris/head  -I/usr/src/cddl/usr.sbin/dtrace/../../../cddl/contrib/opensolaris/lib/libdtrace/common  -I/usr/src/cddl/usr.sbin/dtrace/../../../cddl/contrib/opensolaris/lib/libproc/common  -I/usr/src/cddl/usr.sbin/dtrace/../../../sys/cddl/contrib/opensolaris/uts/common  -I/usr/src/cddl/usr.sbin/dtrace/../../../sys/cddl/contrib/opensolaris/compat -DNEED_SOLARIS_BOOLEAN -std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wno-pointer-sign -Wno-unknown-pragmas -Wno-empty-body -Wno-string-plus-int -Wno-unused-const-variable -Wno-tautological-compare -Wno-unused-value -Wno-parentheses-equality -Wno-unused-function -Wno-enum-conversion -Wno-switch -Wno-switch-enum -Wno-knr-promoted-parameter -Wno-parentheses -Qunused-arguments  -o dtrace dtrace.o -ldtrace -ly 

-ll -lproc -lctf -lelf -lz -lutil -lrtld_db -lpthread
/usr/obj/usr/src/tmp/usr/lib/libdtrace.so: undefined reference to 
`dt_idstack_lookup'
/usr/obj/usr/src/tmp/usr/lib/libdtrace.so: undefined reference to 
`dt_idops_probe'
/usr/obj/usr/src/tmp/usr/lib/libdtrace.so: undefined reference to 
`dt_idhash_nextid'
/usr/obj/usr/src/tmp/usr/lib/libdtrace.so: undefined reference to 
`dt_idhash_create'
/usr/obj/usr/src/tmp/usr/lib/libdtrace.so: undefined reference to 
`dt_idhash_lookup'
/usr/obj/usr/src/tmp/usr/lib/libdtrace.so: undefined reference to 
`dt_idops_type'
/usr/obj/usr/src/tmp/usr/lib/libdtrace.so: undefined reference to 
`dt_idops_thaw'
/usr/obj/usr/src/tmp/usr/lib/libdtrace.so: undefined reference to `dt_idops_args

Please fix.


To fix the build,

% svn revert -r 377300:377299 .

Just double checked and I can build r277307 without issue; the build box
is running 10.1-RELEASE.

My head box is quite a bit slower and is still running, but it did
complete a full buildworld on what is r277300 before it was committed, so
there's no reason to think it won't complete.

...
--- ldd32 ---
cc -m32 -march=i686 -mmmx -msse -msse2 -DCOMPAT_32BIT  -isystem 
/usr/obj/usr/home/smh/freebsd/base/head/lib32/usr/include/ 
-L/usr/obj/usr/home/smh/freebsd/base/head/lib32/usr/lib32 
-B/usr/obj/usr/home/smh/freebsd/base/head/lib32/usr/lib32 -O2 -pipe 
-std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wall 
-Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes 
-Wmissing-prototypes -Wpointer-arith -Wreturn-type -Wcast-qual 
-Wwrite-strings -Wswitch -Wshadow -Wunused-parameter -Wcast-align 
-Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls 
-Wold-style-definition -Wno-pointer-sign -Wmissing-variable-declarations 
-Wthread-safety -Wno-empty-body -Wno-string-plus-int 
-Wno-unused-const-variable -Qunused-arguments -o ldd32 ldd.o sods.o

--- buildworld_epilogue ---
--
 World build completed on Sun Jan 18 01:31:27 UTC 2015
--

svn info |grep Revision
Revision: 277307

Is anyone else seeing this issue?

Regards
Steve



Re: who broke dtrace and buildworld?

2015-01-17 Thread Steven Hartland


On 18/01/2015 02:35, Steve Kargl wrote:

On Sun, Jan 18, 2015 at 02:10:19AM +, Steven Hartland wrote:

On 18/01/2015 01:44, Steve Kargl wrote:

On Sun, Jan 18, 2015 at 01:40:09AM +, Steven Hartland wrote:

% svn revert -r 377300:377299 .

Just double checked and I can build r277307 without issue; the build box
is running 10.1-RELEASE.

My head box is quite a bit slower and is still running, but it did
complete a full buildworld on what is r277300 before it was committed, so
there's no reason to think it won't complete.

My laptop is running

% uname -a (with some editing)
FreeBSD 11.0-CURRENT r275646: Tue Dec  9 12:23:30 PST 2014

and I understand the a bit slower statement as it takes 5+ hours
to buildworld on my laptop.

Not sure if it matters, but I'm building i386 not amd64.

I just replaced my make.conf and src.conf with the ones you posted,
retested, and again the build completes.

tinderbox is based off universe, just with error reporting, so it tested
buildworld and buildkernel for all archs; I can't see i386 being an
issue either, but I'm testing now with TARGET=i386 just to be sure.

Could you verify you don't have something stale or a bad checkout?

I did all of the checking before I sent the first email (including
multiple 'svn update' and 'svn status').  The tree before reverting
your patch was an up-to-date head without any other patches.  I
use neither ccache nor -DNOCLEAN and use 'rm -rf /usr/obj/*' to
clean out OBJDIR.  Without your patch installed everything completed
as I expected (well, I did hit the MCA_system issue), and updated
my system.  I'm now trying to again rebuild buildworld from scratch.
This is going to take awhile.

buildworld with TARGET=i386 worked fine as did standard amd64 on head

...lsqlite3   -lz  -lcrypto  -lssl  -lpthread
--- buildworld_epilogue ---
--
 World build completed on Sun Jan 18 02:23:24 UTC 2015
--

...uments  -o ldd32 ldd.o sods.o
--- buildworld_epilogue ---
--
 World build completed on Sun Jan 18 02:40:26 UTC 2015
--

Both from Revision: 277307


Re: who broke dtrace and buildworld?

2015-01-17 Thread Steven Hartland


On 18/01/2015 01:44, Steve Kargl wrote:

On Sun, Jan 18, 2015 at 01:40:09AM +, Steven Hartland wrote:

% svn revert -r 377300:377299 .

Just double checked and I can build r277307 without issue; the build box
is running 10.1-RELEASE.

My head box is quite a bit slower and is still running, but it did
complete a full buildworld on what is r277300 before it was committed, so
there's no reason to think it won't complete.

My laptop is running

% uname -a (with some editing)
FreeBSD 11.0-CURRENT r275646: Tue Dec  9 12:23:30 PST 2014

and I understand the a bit slower statement as it takes 5+ hours
to buildworld on my laptop.

Not sure if it matters, but I'm building i386 not amd64.

I just replaced my make.conf and src.conf with the ones you posted, 
retested, and again the build completes.

tinderbox is based off universe, just with error reporting, so it tested 
buildworld and buildkernel for all archs; I can't see i386 being an 
issue either, but I'm testing now with TARGET=i386 just to be sure.


Could you verify you don't have something stale or a bad checkout?




Re: asr(4) error with new clang/llvm

2015-01-01 Thread Steven Hartland


On 02/01/2015 01:23, Bjoern A. Zeeb wrote:

Hi,

you need the next line of source to see that while the union only defines 
Simple[1], the comparison goes up to SG_LIST (or something) which is indeed 
defined as 58.  Can someone fix this?  This currently makes i386 compiles 
fail.

/scratch/tmp/bz/head.svn/sys/modules/asr/../../dev/asr/asr.c:1849:29: error: 
array index 58 is past the end of the array (which contains 1 element) 
[-Werror,-Warray-bounds]
 while ((len > 0) && (sg < &((PPRIVATE_SCSI_SCB_EXECUTE_MESSAGE)
^
/scratch/tmp/bz/head.svn/sys/dev/asr/i2omsg.h:934:8: note: array 'Simple' 
declared here
I2O_SGE_SIMPLE_ELEMENT  Simple[1];
^

If that's wrong it looks like there's also a number of calls to the 
macro SG(SGL,Index,Flags,Buffer,Size) which are also wrong, as Index is 
used in the same way:

(((PI2O_SG_ELEMENT)(SGL))->u.Simple[Index]

There appear to be two calls to SG where Index is 1.

I'm afraid I have no idea what the fix would be as the entire driver is 
very voodoo like to me :(


Regards
Steve
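
A minimal sketch of the pattern the compiler is complaining about, written in Python with ctypes rather than taken from the driver: a structure declares a one-element array but the surrounding buffer is sized for many elements. The Element and SgList classes and their field names are hypothetical stand-ins, not the real I2O definitions; only the element count of 58 comes from the error message above.

```python
import ctypes

SG_SIZE = 58  # element count reported in the compiler error above


class Element(ctypes.Structure):
    # Hypothetical stand-in for I2O_SGE_SIMPLE_ELEMENT; field names are made up.
    _fields_ = [("flag_count", ctypes.c_uint32),
                ("phys_addr", ctypes.c_uint32)]


class SgList(ctypes.Structure):
    # Declared with a single element, mirroring `Simple[1]`.
    _fields_ = [("Simple", Element * 1)]


def index_declared(sgl, index):
    """Index through the declared one-element array; ctypes enforces the bound."""
    try:
        return sgl.Simple[index]
    except IndexError:
        return None


def index_overlay(buf, index):
    """Index through an array type sized to cover the whole backing buffer."""
    count = len(buf) // ctypes.sizeof(Element)
    return (Element * count).from_buffer(buf)[index]


# Backing storage large enough for SG_SIZE elements, viewed two ways.
buf = bytearray(ctypes.sizeof(Element) * SG_SIZE)
sgl = SgList.from_buffer(buf)
```

Here the declared view refuses any index past 0, while the overlay view reaches the last element of the same memory. That is roughly the gap between what the declaration says and what the code does, which is what -Warray-bounds flags.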


Re: Fix Emulex oce driver in CURRENT

2014-12-05 Thread Steven Hartland


On 04/09/2014 09:49, Borja Marcos wrote:

On Jun 30, 2014, at 8:02 PM, John Baldwin wrote:


I think these sound fine, but I've cc'd Xin Li (delphij@) who has worked with
folks at Emulex to maintain this driver.  He is probably the best person to
review this.

Hi,

Seems 10.1 is on the pipeline now, but as far as I know none of these fixes have been 
applied to -STABLE. Any chances to do it yet? As far as I know, the oce 
driver is currently unusable in -STABLE. I managed to cause a panic reliably within 30 
seconds.


Was there any conclusion to this? current, releng/10.0 and releng/10.1 
seem pretty similar with regards to oce, but a customer is reporting panics 
very similar to those in this thread.


Did the commit of the additional locking never make it in?

Regards
Steve


Re: Fix Emulex oce driver in CURRENT

2014-12-05 Thread Steven Hartland


On 05/12/2014 13:07, Borja Marcos wrote:

On Dec 5, 2014, at 2:00 PM, Steven Hartland wrote:


On 04/09/2014 09:49, Borja Marcos wrote:

On Jun 30, 2014, at 8:02 PM, John Baldwin wrote:


I think these sound fine, but I've cc'd Xin Li (delphij@) who has worked with
folks at Emulex to maintain this driver.  He is probably the best person to
review this.

Hi,

Seems 10.1 is on the pipeline now, but as far as I know none of these fixes have been 
applied to -STABLE. Any chances to do it yet? As far as I know, the oce 
driver is currently unusable in -STABLE. I managed to cause a panic reliably within 30 
seconds.

Was there any conclusion to this? current, releng/10.0 and releng/10.1 seem 
pretty similar with regards to oce, but a customer is reporting panics very 
similar to those in this thread.

Did the commit of the additional locking never make it in?

Not as far as I know. I've updated a couple of machines here to 10-STABLE and 
I've been applying the patch manually myself.

I don't think it's been applied even to -HEAD.

For now I've told my coworkers to avoid Emulex cards whenever possible. As far 
as  I know the driver is unusable in its present state.

Thanks for the quick reply Borja, review of the patch is now up:
https://reviews.freebsd.org/D1269

Hopefully we can get this in the tree and make oce usable moving forward.

Regards
Steve


Re: Problem with r274819? asr_timeout() not found

2014-11-22 Thread Steven Hartland

Fixed, sorry forgot asr wasn't in GENERIC, so missed it in testing.

On 22/11/2014 14:34, David Wolfskill wrote:

Running:
FreeBSD g1-253.catwhisker.org 11.0-CURRENT FreeBSD 11.0-CURRENT #1434  
r274790M/274790:1100047: Fri Nov 21 06:07:24 PST 2014 
r...@g1-253.catwhisker.org:/common/S4/obj/usr/src/sys/CANARY  i386


Updated sources to r274845; make buildworld is OK, but make buildkernel:

...

stage 3.2: building everything

...
=== asr (all)
--- asr.o ---
--- all_subdir_asmc ---
ctfconvert -L VERSION -g asmc.o
--- all_subdir_asr ---
clang -O2 -pipe  -fno-strict-aliasing -Werror -D_KERNEL -DKLD_MODULE -nostdinc  
 -DHAVE_KERNEL_OPTION_HEADERS -include 
/common/S4/obj/usr/src/sys/GENERIC/opt_global.h -I. -I/usr/src/sys 
-I/usr/src/sys/contrib/altq -fno-common -g -I/common/S4/obj/usr/src/sys/GENERIC 
 -mno-mmx -mno-sse -msoft-float -ffreestanding -fstack-protector -gdwarf-2 
-mno-aes -mno-avx -Qunused-arguments -std=iso9899:1999 -fstack-protector -Wall 
-Wredundant-decls -Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes 
-Wpointer-arith -Winline -Wcast-qual  -Wundef -Wno-pointer-sign 
-fformat-extensions  -Wmissing-include-dirs -fdiagnostics-show-option  
-Wno-error-tautological-compare -Wno-error-empty-body  
-Wno-error-parentheses-equality -Wno-error-unused-function -Wno-array-bounds  
-mno-aes -mno-avx -Qunused-arguments -c 
/usr/src/sys/modules/asr/../../dev/asr/asr.c
--- all_subdir_asmc ---
--- asmc.kld ---
ld -d -warn-common -r -d -o asmc.kld asmc.o
ctfmerge -L VERSION -g -o asmc.kld asmc.o
: export_syms
awk -f /usr/src/sys/conf/kmod_syms.awk asmc.kld  export_syms | xargs -J% 
objcopy % asmc.kld
--- asmc.ko.debug ---
ld -Bshareable -d -warn-common -o asmc.ko.debug asmc.kld
--- asmc.ko.symbols ---
objcopy --only-keep-debug asmc.ko.debug asmc.ko.symbols
--- all_subdir_asr ---
/usr/src/sys/modules/asr/../../dev/asr/asr.c:393:15: error: use of undeclared 
identifier 'asr_timeout'
 ch = timeout(asr_timeout, (caddr_t)ccb,
  ^
1 error generated.
--- all_subdir_asmc ---
--- asmc.ko ---
--- all_subdir_asr ---
*** [asr.o] Error code 1

bmake: stopped in /usr/src/sys/modules/asr
1 error

bmake: stopped in /usr/src/sys/modules/asr
--- all_subdir_asmc ---
objcopy --strip-debug --add-gnu-debuglink=asmc.ko.symbols asmc.ko.debug asmc.ko
--- all_subdir_asr ---
*** [all_subdir_asr] Error code 2

bmake: stopped in /usr/src/sys/modules
--- all_subdir_asmc ---
A failure has been detected in another branch of the parallel make

bmake: stopped in /usr/src/sys/modules/asmc
*** [all_subdir_asmc] Error code 2

bmake: stopped in /usr/src/sys/modules
--- all_subdir_arcmsr ---
ctfconvert -L VERSION -g arcmsr.o
A failure has been detected in another branch of the parallel make

bmake: stopped in /usr/src/sys/modules/arcmsr
*** [all_subdir_arcmsr] Error code 2

bmake: stopped in /usr/src/sys/modules
--- all_subdir_aic7xxx ---
--- aic79xx.o ---
ctfconvert -L VERSION -g aic79xx.o
A failure has been detected in another branch of the parallel make

bmake: stopped in /usr/src/sys/modules/aic7xxx/ahd
*** [_sub.all] Error code 2

bmake: stopped in /usr/src/sys/modules/aic7xxx
1 error

bmake: stopped in /usr/src/sys/modules/aic7xxx
*** [all_subdir_aic7xxx] Error code 2

bmake: stopped in /usr/src/sys/modules
4 errors

bmake: stopped in /usr/src/sys/modules
*** [modules-all] Error code 2

bmake: stopped in /common/S4/obj/usr/src/sys/GENERIC
1 error

bmake: stopped in /common/S4/obj/usr/src/sys/GENERIC
*** [buildkernel] Error code 2

bmake: stopped in /usr/src
1 error

bmake: stopped in /usr/src
*** [buildkernel] Error code 2

make: stopped in /usr/src
1 error

make: stopped in /usr/src
freebeast(11.0-C)[3]


And r274819 did:

Index: asr.c
===
--- asr.c   (revision 274818)
+++ asr.c   (revision 274819)
@@ -386,8 +386,12 @@
 STAILQ_HEAD_INITIALIZER(Asr_softc_list);
 
 static __inline void
-set_ccb_timeout_ch(union asr_ccb *ccb, struct callout_handle ch)
+set_ccb_timeout_ch(union asr_ccb *ccb)
 {
+   struct callout_handle ch;
+
+   ch = timeout(asr_timeout, (caddr_t)ccb,
+       (int)((u_int64_t)(ccb->ccb_h.timeout) * (u_int32_t)hz / 1000));
    ccb->ccb_h.sim_priv.entries[0].ptr = ch.callout;
 }
 
@@ -812,8 +816,7 @@
     */
    ccb->ccb_h.timeout = 6 * 60 * 1000;
    }
-   set_ccb_timeout_ch(ccb, timeout(asr_timeout, (caddr_t)ccb,
-     (ccb->ccb_h.timeout * hz) / 1000));
+   set_ccb_timeout_ch(ccb);
    }
    splx(s);
 } /* ASR_ccbAdd */
@@ -1337,9 +1340,7 @@
     cam_sim_unit(xpt_path_sim(ccb->ccb_h.path)), s);
    if (ASR_reset (sc) == ENXIO) {
    /* Try again later */
-   set_ccb_timeout_ch(ccb, timeout(asr_timeout,
-     (caddr_t)ccb,
-     (ccb->ccb_h.timeout * hz) / 1000));
+   
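
A minimal sketch of one plausible reason for the (u_int64_t) cast in the first hunk of the diff above: if the millisecond timeout is a 32-bit unsigned value, timeout * hz can wrap around before the division by 1000. The thread does not say why the cast was added, so treat this as an assumption; the 32-bit width, the hz value of 1000 and the function names are all illustrative.

```python
MASK32 = 0xFFFFFFFF  # models arithmetic truncated to 32 unsigned bits


def ticks_32bit(timeout_ms, hz):
    """Milliseconds to ticks with the product truncated to 32 bits."""
    return ((timeout_ms * hz) & MASK32) // 1000


def ticks_64bit(timeout_ms, hz):
    """Milliseconds to ticks with the product kept in a wider type."""
    return (timeout_ms * hz) // 1000


if __name__ == "__main__":
    hz = 1000  # assumed tick rate
    for timeout_ms in (6 * 60 * 1000, 5000000):
        print(timeout_ms, ticks_32bit(timeout_ms, hz), ticks_64bit(timeout_ms, hz))
```

With these assumed values the six-minute timeout from the diff converts identically both ways, while a timeout of 5,000,000 ms only survives the conversion when the product is widened first.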

Re: init(8) diagnostics?

2014-11-16 Thread Steven Hartland

It's usually something like stalled I/O.

On 16/11/2014 18:07, Steve Kargl wrote:

In init(8), one finds under DIAGNOSTICS

some processes would not die; ps axl advised.

So, just how is one to actually run 'ps axl advised' as
the message appears as init(8) is killing off the system?





Re: r273165. ZFS ARC: possible memory leak to Inact

2014-11-05 Thread Steven Hartland


On 05/11/2014 06:15, Marcus Reid wrote:

On Tue, Nov 04, 2014 at 06:13:44PM +, Steven Hartland wrote:

On 04/11/2014 17:22, Allan Jude wrote:

snip...
Justin Gibbs and I were helping George from Voxer look at the same issue
they are having. They had ~169GB in inact, and only ~60GB being used for
ARC.

Are there any further debugging steps we can recommend to him to help
investigate this?

The various scripts attached to the ZFS ARC behavior problem and fix PR
will help provide detail on this.
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594

I've seen it here where there's been bursts of ZFS I/O specifically
write bursts.

What happens is that ZFS will consume large amounts of space in various
UMA zones to accommodate these bursts.

If you push the vmstat -z that he provided through the arc summary
script, you'll see that this is not what is happening.  His uma stats
match up with his arc, and do not account for his inactive memory.

uma script summary:

 Totals
 oused: 5.860GB, ofree: 1.547GB, ototal: 7.407GB
 zused: 56.166GB, zfree: 3.918GB, ztotal: 60.084GB
 used: 62.026GB, free: 5.465GB, total: 67.491GB

His provided top stats:

 Mem: 19G Active, 20G Inact, 81G Wired, 59M Cache, 3308M Buf, 4918M Free
 ARC: 66G Total, 6926M MFU, 54G MRU, 8069K Anon, 899M Header, 5129M Other


The big uma buckets (zio_buf_16384 and zio_data_buf_131072, 18.002GB and
28.802GB respectively) are nearly 0% free.


Still potentially accounts for 5.4GB of your 20GB inact.

The rest could be malloc backed allocations?

Regards
Steve
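
A small sketch that re-adds the figures quoted in this message, to show where the 5.4GB comes from and how it compares with the 20G Inact line. All numbers are copied from the quoted summary and top output; reading the "o" and "z" prefixes as "other" and "ZFS" zones is an assumption about the script, not something stated here.

```python
# Figures in GB, copied from the quoted "uma script summary".
SUMMARY = {
    "other": {"used": 5.860, "free": 1.547},  # oused / ofree (assumed: non-ZFS zones)
    "zfs": {"used": 56.166, "free": 3.918},   # zused / zfree (assumed: ZFS zones)
}
INACT_GB = 20.0  # "20G Inact" from the quoted top output


def totals(summary):
    """Return (used, free, total) across all zone groups, rounded to MB precision."""
    used = sum(group["used"] for group in summary.values())
    free = sum(group["free"] for group in summary.values())
    return round(used, 3), round(free, 3), round(used + free, 3)


def free_share_of_inact(summary, inact_gb):
    """Fraction of Inact that free-but-cached UMA items could explain at most."""
    _, free, _ = totals(summary)
    return free / inact_gb


if __name__ == "__main__":
    print(totals(SUMMARY))
    print(free_share_of_inact(SUMMARY, INACT_GB))
```

The totals reproduce the used, free and total lines of the summary, and the free portion is a little over a quarter of the Inact figure, so even on this reading most of the 20GB is left unexplained.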


Re: r273165. ZFS ARC: possible memory leak to Inact

2014-11-05 Thread Steven Hartland


On 05/11/2014 09:52, Andriy Gapon wrote:

On 04/11/2014 14:55, Steven Hartland wrote:

This is likely spikes in uma zones used by ARC.

The VM doesn't ever clean uma zones unless it hits a low memory condition, which
explains why your little script helps.

Check the output of vmstat -z to confirm.

Steve,

this is nonsense :-)  You know perfectly well that UMA memory is Wired not 
Inactive.


I'll wake up in a bit honest, thanks for the slap ;-)


Re: r273165. ZFS ARC: possible memory leak to Inact

2014-11-04 Thread Steven Hartland

This is likely spikes in uma zones used by ARC.

The VM doesn't ever clean uma zones unless it hits a low memory 
condition, which explains why your little script helps.


Check the output of vmstat -z to confirm.

On 04/11/2014 11:47, Dmitriy Makarov wrote:

Hi Current,

It seems like there is constant flow (leak) of memory from ARC to Inact in 
FreeBSD 11.0-CURRENT #0 r273165.

Normally, our system (FreeBSD 11.0-CURRENT #5 r260625) keeps ARC size very 
close to vfs.zfs.arc_max:

Mem: 16G Active, 324M Inact, 105G Wired, 1612M Cache, 3308M Buf, 1094M Free
ARC: 88G Total, 2100M MFU, 78G MRU, 39M Anon, 2283M Header, 6162M Other


But after an upgrade to (FreeBSD 11.0-CURRENT #0 r273165) we observe enormous 
numbers of Inact memory in the top:

Mem: 21G Active, 45G Inact, 56G Wired, 357M Cache, 3308M Buf, 1654M Free
ARC: 42G Total, 6025M MFU, 30G MRU, 30M Anon, 819M Header, 5214M Other

Funny thing is that when we manually allocate and release memory, using simple 
python script:

#!/usr/local/bin/python2.7

import sys
import time

if len(sys.argv) != 2:
    print("usage: fillmem number-of-megabytes")
    sys.exit()

count = int(sys.argv[1])

megabyte = (0,) * (1024 * 1024 / 8)

data = megabyte * count

as:

# ./simple_script 1

all those allocated megabytes 'migrate' from Inact to Free, and afterwards they 
are 'eaten' by ARC with no problem, until Inact slowly grows back to the number 
it was before we ran the script.

The current workaround is to periodically invoke this python script from cron.
This is an ugly workaround and we really don't like it on our production


To answer possible questions about ARC efficiency:
Cache efficiency drops dramatically with every GiB pushed off the ARC.

Before upgrade:
 Cache Hit Ratio:99.38%

After upgrade:
 Cache Hit Ratio:81.95%

We believe that ARC misbehaves and we ask your assistance.


--

Some values from configs.

HW: 128GB RAM, LSI HBA controller with 36 disks (stripe of mirrors).

top output:

In /boot/loader.conf :
vm.kmem_size=110G
vfs.zfs.arc_max=90G
vfs.zfs.arc_min=42G
vfs.zfs.txg.timeout=10

---

Thanks.

Regards,
Dmitriy


Re: r273165. ZFS ARC: possible memory leak to Inact

2014-11-04 Thread Steven Hartland


On 04/11/2014 17:22, Allan Jude wrote:

snip...
Justin Gibbs and I were helping George from Voxer look at the same issue
they are having. They had ~169GB in inact, and only ~60GB being used for
ARC.

Are there any further debugging steps we can recommend to him to help
investigate this?

The various scripts attached to the ZFS ARC behavior problem and fix PR 
will help provide detail on this.

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594

I've seen it here where there's been bursts of ZFS I/O specifically 
write bursts.


What happens is that ZFS will consume large amounts of space in various 
UMA zones to accommodate these bursts.


The VM only triggers UMA reclaim when it sees pressure; however, if the 
main memory consumer is ZFS ARC it's possible that the required pressure 
will not be applied, because ZFS takes free memory into account when 
allocating ARC.

The result is that it will back off its memory requirements before the 
reclaim is triggered, leaving all the space allocated but not used.

I was playing around with a patch, on that bug report, which added clear 
down of UMA within ZFS ARC to avoid just this behavior, but it's very 
much me playing for testing the theory only.

From what I've seen UMA needs something like the coloring which can be 
used to trigger clear down over time, to prevent UMA zones sitting there 
eating large amounts of memory like they currently do.


Regards
Steve


Re: r273165. ZFS ARC: possible memory leak to Inact

2014-11-04 Thread Steven Hartland


On 04/11/2014 17:57, Ben Perrault wrote:

snip...

I would also be interested in any additional debugging steps and would be 
willing to help test in any way I can, as I've seen the behavior a few times 
as well. As recently as Sunday evening, I caught a system running with ~44GB ARC 
but ~117GB inactive.


You should find the UMA summary script quite helpful in this regard:
https://bugs.freebsd.org/bugzilla/attachment.cgi?id=147754
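
A rough sketch of the kind of per-zone accounting such a summary performs. This is not the script attached to the PR. It assumes the common vmstat -z layout of "NAME: SIZE, LIMIT, USED, FREE, ..." per line, and the function names are made up for illustration.

```python
def parse_vmstat_z(text):
    """Return (zone, used_bytes, free_bytes) tuples from `vmstat -z` style text."""
    zones = []
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # header or blank line: no "NAME:" prefix
        fields = [field.strip() for field in rest.split(",")]
        if len(fields) < 4:
            continue
        try:
            size, _limit, used, free = (int(field) for field in fields[:4])
        except ValueError:
            continue  # not a numeric zone line
        zones.append((name.strip(), size * used, size * free))
    return zones


def summarize(zones):
    """Total bytes held in allocated items and in cached-but-free items."""
    used = sum(zone[1] for zone in zones)
    free = sum(zone[2] for zone in zones)
    return used, free
```

Feeding it the text of vmstat -z and sorting the result by the free column would show which zones are holding the most cached-but-unused memory, which is the pattern described earlier in the thread.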


Re: HEADS UP: Standalone kernel debug files moving out of /boot/kernel/

2014-10-30 Thread Steven Hartland


On 30/10/2014 08:24, O'Connor, Daniel wrote:

On 30 Oct 2014, at 13:23, Steven Hartland kill...@multiplay.co.uk wrote:

Making things harder to manage vs saving a little bit of space on the
root partition really doesn't sound like a good idea; especially when
with the ZFS install, which I would suggest is becoming the norm, the
root partition doesn't suffer from space issues anyway.

Note that it’s not “a little bit” of space.
[freebsd10 8:21] /boot/kernel ll kernel *.ko| awk '{i += $5} END {print $5}'
49312
[freebsd10 8:21] /boot/kernel ll *.symbols | awk '{i += $5} END {print $5}'
212464

i.e. the debug information is more than 4x larger than the code it's for (!).

That's still a trivial amount of space in the grand scheme of things.

I agree managing the symbol files does become significantly more difficult in this 
case but the patch makes quite a substantial difference to the number of kernels 
you can keep in / (especially on older installs which have 1GB roots).

The better solution is to not use a 1GB root.

Perhaps there could be a flag to disable it just for the kernel that could be 
put into /etc/make.conf? That way it’s set and forget if you are kernel 
juggling.

Make it a non-default option, which can be used by those who have 
limited space on their root.


Regards
Steve
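
A hedged sketch of the size comparison being argued over. Note that the awk one-liners as quoted accumulate into i but print $5, which in most awk implementations is the size field of the last input line rather than the sum. The version below totals code and debug files separately; the ".symbols" and ".ko" suffixes and the /boot/kernel path come from the thread, while the function names are made up for illustration.

```python
import os


def split_sizes(entries):
    """Sum sizes of (name, size) pairs into (code_bytes, debug_bytes)."""
    code = debug = 0
    for name, size in entries:
        if name.endswith(".symbols"):
            debug += size
        elif name == "kernel" or name.endswith(".ko"):
            code += size
    return code, debug


def directory_entries(path):
    """Yield (name, size) for regular files directly inside path."""
    for entry in os.scandir(path):
        if entry.is_file():
            yield entry.name, entry.stat().st_size


if __name__ == "__main__":
    kernel_dir = "/boot/kernel"  # path taken from the thread
    if os.path.isdir(kernel_dir):
        code, debug = split_sizes(directory_entries(kernel_dir))
        print("code bytes:", code, "debug bytes:", debug)
```

Run against a real /boot/kernel this gives the two totals directly, so the ratio can be checked on any given install rather than inferred from a single file.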


Re: HEADS UP: Standalone kernel debug files moving out of /boot/kernel/

2014-10-30 Thread Steven Hartland


On 30/10/2014 09:47, O'Connor, Daniel wrote:

On 30 Oct 2014, at 19:44, Steven Hartland kill...@multiplay.co.uk wrote:

On 30/10/2014 08:24, O'Connor, Daniel wrote:

On 30 Oct 2014, at 13:23, Steven Hartland kill...@multiplay.co.uk wrote:

Making things harder to manage vs saving a little bit of space on the
root partition really doesn't sound like a good idea; especially when
with the ZFS install, which I would suggest is becoming the norm, the
root partition doesn't suffer from space issues anyway.

Note that it’s not “a little bit” of space.
[freebsd10 8:21] /boot/kernel ll kernel *.ko| awk '{i += $5} END {print $5}'
49312
[freebsd10 8:21] /boot/kernel ll *.symbols | awk '{i += $5} END {print $5}'
212464

i.e. the debug information is more than 4x larger than the code it's for (!).

That's still a trivial amount of space in the grand scheme of things.

Yes.


I agree managing the symbol files does become significantly more difficult in this 
case but the patch makes quite a substantial difference to the number of kernels 
you can keep in / (especially on older installs which have 1GB roots).

The better solution is to not use a 1GB root.

Unfortunately once you install it’s impossible to expand. There are quite a few 
older systems that have been upgraded with relatively small root partitions.

I would suggest we treat those as legacy systems and look to improve the 
layout moving forward, instead of applying changes which make it more 
difficult to maintain for everyone.

Perhaps there could be a flag to disable it just for the kernel that could be 
put into /etc/make.conf? That way it’s set and forget if you are kernel 
juggling.

Make it a non-default option, which can be used by those who have
limited space on their root.

Perhaps, but the defaults have been for quite small root partitions for a long 
time so I expect there are a lot of systems with a small root.

These systems are working fine though, are they not? They may not be able 
to have loads of kernels installed, but if you want to do that then it's 
worth the pain of the reinstall instead of penalizing everyone.


Regards
Steve


Re: HEADS UP: Standalone kernel debug files moving out of /boot/kernel/

2014-10-30 Thread Steven Hartland


On 30/10/2014 12:10, Bjoern A. Zeeb wrote:

On 30 Oct 2014, at 09:47 , O'Connor, Daniel Daniel.O'con...@emc.com wrote:


On 30 Oct 2014, at 19:44, Steven Hartland kill...@multiplay.co.uk wrote:

On 30/10/2014 08:24, O'Connor, Daniel wrote:

On 30 Oct 2014, at 13:23, Steven Hartland kill...@multiplay.co.uk wrote:

Making things harder to manage vs saving a little bit of space on the
root partition really doesn't sound like a good idea; especially when
with the ZFS install, which I would suggest is becoming the norm, the
root partition doesn't suffer from space issues anyway.

Note that it’s not “a little bit” of space.
[freebsd10 8:21] /boot/kernel ll kernel *.ko| awk '{i += $5} END {print $5}'
49312
[freebsd10 8:21] /boot/kernel ll *.symbols | awk '{i += $5} END {print $5}'
212464

i.e. the debug information is more than 4x larger than the code it's for (!).

That's still a trivial amount of space in the grand scheme of things.

Yes.

No it is not a trivial amount of space;  it’s about 20ish% of the installation 
of the base system.

It may be a decent percentage, but when that's a percentage of a small 
number it still results in a trivial amount of space in the grand 
scheme of things, which was my point ;-)

I guess I am one source for the request to get them out of 
/boot/kernel/*.symbols into a separate tarball and not install them by default 
for users.

The thing to keep in mind, if we don't have symbols installed, is how we 
create useful panic information.

The reasons for me are indeed manifold:

(a) symbol files are for developers.  Developers are clever people, often with 
custom systems, they know how to deal with them as they already do whatever 
they want anyway in 7 different ways.

Yes and no; generating useful information from a user's panic is 
something we really need to ensure is still possible. Where they 
aren't the developer, they are still the source of the information.

So maybe some sort of post processing utility?

(b) A user should usually never have to bother with them, so why have them 
installed in first place?  They go onto (release) the media so that developers 
have access to them, or that users, if guided by a developer, can install them 
to debug a problem.  Moving them to a separate directory and into a different 
tarball makes that a lot easier.

Assuming the above is solved yes.

(c) We have people deploying gazillions of FreeBSD systems from default media, 
this stuff does add up;  10TB sounds small unless you have to back it up 
regularly, need disaster copies, or you need to minimise recovery or migration 
time, when every s is worth a few $$s.

Are you suggesting you only back up your root partition and not your usr 
partition? If so it's a null argument, as the total size is still the same.

I'm assuming not, and you're saying we shouldn't install debug at all, 
which has the above side effect.

(d) A couple of times a year I do spare VM image backup and recovery including 
migration.  Moving the data back and forth can only happen in a very limited 
time frame, so I try to keep these VM images as small as possible.  rm -f 
/boot/kernel/*.symbol after install is really not keeping the sparseness as no 
one really supports TRIM on the VM images. Thus I’d just have lost 200MB * n 
VMs that I need to move around over 200ms RTT.  There are cruel workarounds; I 
know how to apply them as I am a developer, but if I do this stuff, there must 
be 1000s of users doing that, and they don’t.

Sounds like having a way to not install symbols to the root partition 
for *binary* installs is the real requirement?

So having an alternative location off root would be a solution.

(e) People still have / (boot) filesystems in the 256M-1G space for the 
foreseeable future.  How do you ever cram two kernels in there during an 
update for a machine that you will never ever see again because it’s in 
Antarctica on a 9600 baud connection?

While I appreciate such systems exist, surely, given your previous point 
that you don't actually want them installed at all, providing that 
option, instead of moving them to a different location and causing a load 
more issues, is the better solution, is it not?

(f) …

Yes I can understand that on these 40TB ZFS machines, no one gives a damn about 
200MB, but unfortunately not the entire world runs on these kinds of machines.  
 We have plenty of users on 10 year old hardware still running a FreeBSD 
installation and it works perfectly fine and does the job (until the HW dies;-).


The entire cp -pR kernel kernel.good solution is nothing I’d expect a user to 
ever do.  But I am aware that’s a “developer standard”.  Maybe we just need to 
improve the situation for ourselves rather than pessimising 98% of users out 
there.

Indeed.

I personally do not mind where the symbol files will end up as long as they are 
in their own directory and not onto systems by default with releases unless 
someone ticks a box.  Whether that is /boot/kernel/symbols/* or /usr

Re: HEADS UP: Standalone kernel debug files moving out of /boot/kernel/

2014-10-30 Thread Steven Hartland


On 30/10/2014 14:15, Ed Maste wrote:

On 30 October 2014 09:21, Steven Hartland kill...@multiplay.co.uk wrote:

On 30/10/2014 12:10, Bjoern A. Zeeb wrote:

(a) symbol files are for developers.  Developers are clever people, often
with custom systems, they know how to deal with them as they already do
whatever they want anyway in 7 different ways.

Yes and no; generating useful information from a user's panic is
something we really need to ensure is still possible. Where they aren't
the developer, they are still the source of the information.

So maybe some sort of post processing utility?

We're also going to make debug data for userland (libraries and
binaries) available. On my system there's about 360MB of kernel debug
and 1.5GB of userland debug, and we definitely don't want to
unconditionally install all of that. Thus we're going to have to
provide the capability of installing debug data at install time or
later anyway,

We already have some limited post-processing involved in kernel crash
handling - /etc/rc.d/savecore to pull the crash out of the swap/dump
partition, and crashinfo to extract useful information using kgdb.
There are many useful improvements we could make in kernel crash
handling, including having the process support on-demand fetching of
the kernel debug data.

Yer that's the process that was in my head; if debug symbols aren't 
available when savecore runs we're going to need a way to update / rerun 
when they are available, or even better give it the ability to do the 
same job with remote symbols?

Sounds like having a way to not install symbols to the root partition for
*binary* installs is the real requirement?

That is a requirement, yes.

Moving the debug data to a separate partition also opens up some
compelling use cases for large scale deployments, where multiple
systems run the same release. The machines can run from their own
install on disk, but have the infrequently-used debug data NFS mounted
from a common location. The / and /boot partitions may be mounted
read-only.

Sounds like a good idea :)



The entire cp -pR kernel kernel.good solution is nothing I’d expect a user
to ever do.  But I am aware that’s a “developer standard”.  Maybe we just
need to improve the situation for ourselves rather than pessimising 98% of
users out there.

Indeed.

...

I think overall there are options to move forward, we just need to ensure it's
not at the expense of usability for those that do have space.

Setting DEBUGDIR= in /etc/src.conf will retain the current behaviour
of installing the debug data beside the kernel and modules (and
userland binaries and libraries). Does this adequately address your
use case?

Yep that works :)



Whether that is /boot/kernel/symbols/*
or /usr/lib/***, I couldn’t care less

Note that if they go in /boot/kernel/symbols/ then we have to teach
GDB, LLDB, and other tools to look there; if they go in /usr/lib/debug
they're found automatically by the debuggers.

We may have to add standalone debug path support to other tools, but
it's very little additional work.
One thing to check would be to ensure that /usr is mounted when savecore 
runs.


Regards
Steve
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: HEADS UP: Standalone kernel debug files moving out of /boot/kernel/

2014-10-30 Thread Steven Hartland


On 30/10/2014 14:40, Ed Maste wrote:

On 30 October 2014 10:26, Steven Hartland kill...@multiplay.co.uk wrote:

Yer that's the process that was in my head, if debug symbols aren't
available when savecore runs we're going to need a way to update / rerun
when they are available, or even better give it the ability to do the same
job with remote symbols?

Yeah, remote symbol support will be excellent (eventually).

Crashinfo already operates on the most recent dump by default, so the
solution could be as simple as adding a flag to fetch debug files if
not already installed. The user would only need to run crashinfo -f to
regenerate the crash information with debug data available (and we
could mention that explicitly in the crash report).


Setting DEBUGDIR= in /etc/src.conf will retain the current behaviour
of installing the debug data beside the kernel and modules (and
userland binaries and libraries). Does this adequately address your
use case?

Yep that works :)

Great.

I've been pondering this for so long that I may have forgotten not
everyone has the same context.


One thing to check would be to ensure that /usr is mounted when savecore
runs.

Indeed, but we're covered there: the crash info is generated by
/usr/sbin/crashinfo, which relies on /usr/bin/gdb, so it better be
mounted :)
Fantastic, thanks for taking the time to address my concerns, much 
appreciated :D



Re: HEADS UP: Standalone kernel debug files moving out of /boot/kernel/

2014-10-29 Thread Steven Hartland
Hmm not sure I like this idea as it would make it more difficult to make 
a copy / backup of a kernel.


ATM when I want to copy a kernel for debugging it's a one-liner; 
splitting debug symbols off to /usr/lib would prevent this.


Is there not a way to allow separate install of the debug files but to 
the same location maintaining compartmentalization for the needed kernel 
files?


On 29/10/2014 00:20, Ed Maste wrote:

I am preparing to move the standalone kernel debug data out of
/boot/kernel/ into /usr/lib/debug/boot/kernel/, mirroring the approach
used for userland debug data. This significantly reduces the boot
partition size requirement, and is a step towards supporting the
installation of kernel debug data only when required. LLDB and GDB
automatically search for debug data under /usr/lib/debug/ so this
change should be transparent from an end-user perspective.

The change can be reviewed in Phabricator at
https://reviews.freebsd.org/D1006 and can be fetched as a unified diff
from https://people.freebsd.org/~emaste/patches/D1006.diff

This does not change any defaults or knobs: kernel debug files are
still built by default, and may be disabled by setting
WITHOUT_KERNEL_SYMBOLS=YES in /etc/src.conf. I hope to rationalize
this with userland debug in a later step.

Note that the change renames the intermediate and debug data files to
be consistent with userland debug data: in the build directory the
kernel with debug data included is now named kernel.full, and
kernel.debug is the standalone debug data file.

I plan to merge this in a few days if there are no issues reported in
further review or testing.

-Ed


Re: HEADS UP: Standalone kernel debug files moving out of /boot/kernel/

2014-10-29 Thread Steven Hartland


On 30/10/2014 02:32, Steve Kargl wrote:

On Wed, Oct 29, 2014 at 03:15:50PM -0400, Ed Maste wrote:

On 29 October 2014 12:49, Steven Hartland kill...@multiplay.co.uk wrote:

Hmm not sure I like this idea as it would make it more difficult to make a
copy / backup of a kernel.

ATM when I want to copy a kernel for debugging it's a one-liner; splitting
debug symbols off to /usr/lib would prevent this.

To retain the current behaviour you can set DEBUGDIR= (i.e., empty),
as the debug file install path is ${DESTDIR}${DEBUGDIR}${KODIR}.

No, you can't.

su root
cp -pR /boot/kernel /boot/good

Where does DEBUGDIR enter the picture?  The above will copy
both kernel and kernel.symbol to /boot/good.  With your scheme
one loses kernel.symbol (along with all other *.symbol files?).
If one escapes to the boot prompt, she can do 'boot /boot/good/kernel';
will the boot process automatically find a (nonexistent?)
/usr/lib/boot/good/kernel.symbol?
Indeed, if my understanding of this proposal is correct it will make 
working with multiple kernels much harder.


Making things harder to manage vs saving a little bit of space on the 
root partition really doesn't sound like a good idea; especially when 
with the ZFS install, which I would suggest is becoming the norm, the 
root partition doesn't suffer from space issues anyway.



Re: OOM killer and kernel cache reclamation rate limit in vm_pageout_scan()

2014-10-16 Thread Steven Hartland

Unfortunately ZFS doesn't prevent new inflight writes until it
hits zfs_dirty_data_max, so while what you're suggesting will
help, if the writes come in quickly enough I would expect them to
still be able to outrun the pageout.

- Original Message - 
From: Justin T. Gibbs gi...@freebsd.org

To: freebsd-current@freebsd.org
Cc: a...@freebsd.org; Andriy Gapon a...@freebsd.org
Sent: Thursday, October 16, 2014 6:56 AM
Subject: OOM killer and kernel cache reclamation rate limit in vm_pageout_scan()


avg pointed out the rate limiting code in vm_pageout_scan() during discussion about PR 187594.  While it certainly can contribute to 
the problems discussed in that PR, a bigger problem is that it can allow the OOM killer to be triggered even though there is plenty 
of reclaimable memory available in the system.  Any load that can consume enough pages within the polling interval to hit the 
v_free_min threshold (e.g. multiple 'dd if=/dev/zero of=/file/on/zfs') can make this happen.


The product I’m working on does not have swap configured and treats any OOM trigger as fatal, so it is very obvious when this 
happens. :-)


I’ve tried several things to mitigate the problem.  The first was to ignore rate limiting for pass 2.  However, even though ZFS is 
guaranteed to receive some feedback prior to OOM being declared, my testing showed that a trivial load (a couple dd operations) 
could still consume enough of the reclaimed space to leave the system below its target at the end of pass 2.  After removing the 
rate limiting entirely, I’ve so far been unable to kill the system via a ZFS induced load.


I understand the motivation behind the rate limiting, but the current implementation seems too simplistic to be safe.  The 
documentation for the Solaris slab allocator provides good motivation for their approach of using a “sliding average” to rein in 
temporary bursts of usage without unduly harming efficient service for the recorded steady-state memory demand.  Regardless of the 
approach taken, I believe that the OOM killer must be a last resort and shouldn’t be called when there are caches that can be 
culled.


One other thing I’ve noticed in my testing with ZFS is that it needs feedback and a little time to react to memory pressure. 
Calling its lowmem handler just once isn’t enough for it to limit in-flight writes so it can avoid reuse of pages that it just 
freed up.  But, it doesn’t take too long to react (< 1sec in the profiling I’ve done).  Is there a way in vm_pageout_scan() that we 
can better record that progress is being made (pages were freed in the pass, even if some/all of them were consumed again) and allow 
more passes before the OOM killer is invoked in this case?


—
Justin



Re: zfs hang

2014-10-09 Thread Steven Hartland


- Original Message - 
From: Steve Wills swi...@freebsd.org

To: Andriy Gapon a...@freebsd.org
Cc: curr...@freebsd.org; f...@freebsd.org
Sent: Friday, October 10, 2014 2:27 AM
Subject: Re: zfs hang



On Wed, Oct 08, 2014 at 08:55:26AM +0300, Andriy Gapon wrote:

On 08/10/2014 03:40, Steve Wills wrote:
 Hi,
 
 Not sure which thread this belongs to, but I have a zfs hang on one of my boxes
 running r272152. Running procstat -kka looks like:
 
 http://pastebin.com/szZZP8Tf
 
 My zpool commands seem to be hung in spa_errlog_lock while others are hung in
 zfs_lookup. Suggestions?

There are several threads in zio_wait.  If this is their permanent state then
there is some problem with I/O somewhere below ZFS.


Thanks for the feedback. It seems one of my disks is dying, I rebooted and it
came up OK, but today I got:

 panic: I/O to pool 'rpool' appears to be hung on vdev guid . at 
'/dev/ada0p3'

I have screenshots and backtrace if anyone is interested. Dying drives
shouldn't cause panic, right?


It's the deadman timer kicking in so yes, that's expected.

The following sysctls control this behaviour if you want to try and recover:
vfs.zfs.deadman_synctime_ms: 100
vfs.zfs.deadman_checktime_ms: 5000
vfs.zfs.deadman_enabled: 1

   Regards
   Steve




Re: Looping during boot-up process in FreeBSD-11 current

2014-09-28 Thread Steven Hartland

The only recent ATAPI change I recall is 270327, does it still occur
if you revert that?

   Regards
   Steve

- Original Message - 
From: Mike. the.li...@mgm51.com



I'm starting to look at FreeBSD 11-current to see what's coming soon.
I have an older notebook that I use for test environments for
purposes such as this.  Unfortunately, the notebook won't boot up
from the install CD, there's a loop it cannot seem to get out of.  




Details are:

- The install CD was made from this image:
  FreeBSD-11.0-CURRENT-i386-20140918-r271779-disc1

- The dmesg for the notebook is at the end of this message.  The
dmesg was captured with FreeBSD 10.0.  In the dmesg, you can see the
following lines:

(aprobe0:ata1:0:1:0): ATAPI_IDENTIFY. ACB: a1 00 00 00 00 40 00 00 00
00 00 00
(aprobe0:ata1:0:1:0): CAM status: Command timeout
(aprobe0:ata1:0:1:0): Error 5, Retry was blocked
run_interrupt_driven_hooks: still waiting after 60 seconds for
xpt_config
(aprobe0:ata1:0:1:0): ATAPI_IDENTIFY. ACB: a1 00 00 00 00 40 00 00 00
00 00 00
(aprobe0:ata1:0:1:0): CAM status: Command timeout
(aprobe0:ata1:0:1:0): Error 5, Retry was blocked


which, while slowing down the boot process drastically, still allowed
the boot process to run to successful completion.


- When I try to boot using the FreeBSD 11-current install CD, that
loop seems to go on ad infinitum, or at least for the 5 minutes until
I gave up.   I cannot post a dmesg from that boot-up because I never
got to a prompt.  However, I did take a couple of pictures of the
offending screens.  They are here:
http://archive.mgm51.com/cache/fbsd-11-current-01.jpg
http://archive.mgm51.com/cache/fbsd-11-current-02.jpg
The first image shows the start of the looping, and the second shows
the continuation.


While this notebook is used only for testing, it is important to me
in that aspect.  How can I get around this looping issue?

Please let me know if there's any additional info you need.

Thanks.





And now, the dmesg...

Copyright (c) 1992-2014 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993,
1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 10.0-RELEASE-p8 #1 r271323: Wed Sep 10 20:25:45 EDT 2014
   r...@a31pf.245l.home:/usr/obj/usr/src/sys/GENERIC i386
FreeBSD clang version 3.3 (tags/RELEASE_33/final 183502) 20130610
CPU: Intel(R) Pentium(R) 4 Mobile CPU 1.70GHz (1698.60-MHz 686-class
CPU)
 Origin = GenuineIntel  Id = 0xf24  Family = 0xf  Model = 0x2
Stepping = 4
 Features=0x3febf9ffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,
MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM
real memory  = 1073741824 (1024 MB)
avail memory = 1029230592 (981 MB)
kbd1 at kbdmux0
random: Software, Yarrow initialized
acpi0: IBM TP-1G on motherboard
acpi_ec0: Embedded Controller: GPE 0x1c, ECDT port 0x62,0x66 on
acpi0
acpi0: Power Button (fixed)
acpi0: reservation of 0, a (3) failed
acpi0: reservation of 10, 3ff0 (3) failed
cpu0: ACPI CPU on acpi0
attimer0: AT timer port 0x40-0x43 irq 0 on acpi0
Timecounter i8254 frequency 1193182 Hz quality 0
Event timer i8254 frequency 1193182 Hz quality 100
atrtc0: AT realtime clock port 0x70-0x71 irq 8 on acpi0
Event timer RTC frequency 32768 Hz quality 0
Timecounter ACPI-safe frequency 3579545 Hz quality 850
acpi_timer0: 24-bit timer at 3.579545MHz port 0x1008-0x100b on
acpi0
acpi_lid0: Control Method Lid Switch on acpi0
acpi_button0: Sleep Button on acpi0
pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0
pci0: ACPI PCI bus on pcib0
agp0: Intel 82845 host to AGP bridge on hostb0
pcib1: ACPI PCI-PCI bridge at device 1.0 on pci0
pci1: ACPI PCI bus on pcib1
vgapci0: VGA-compatible display port 0x3000-0x30ff mem
0xe800-0xefff,0xd010-0xd010 irq 11 at device 0.0 on
pci1
vgapci0: Boot video device
uhci0: Intel 82801CA/CAM (ICH3) USB controller USB-A port
0x1800-0x181f irq 11 at device 29.0 on pci0
usbus0 on uhci0
uhci1: Intel 82801CA/CAM (ICH3) USB controller USB-B port
0x1820-0x183f irq 11 at device 29.1 on pci0
usbus1 on uhci1
uhci2: Intel 82801CA/CAM (ICH3) USB controller USB-C port
0x1840-0x185f irq 11 at device 29.2 on pci0
usbus2 on uhci2
pcib2: ACPI PCI-PCI bridge at device 30.0 on pci0
pci2: ACPI PCI bus on pcib2
cbb0: RF5C476 PCI-CardBus Bridge mem 0x5000-0x5fff irq 11
at device 0.0 on pci2
cardbus0: CardBus bus on cbb0
pccard0: 16-bit PCCard bus on cbb0
cbb1: RF5C476 PCI-CardBus Bridge mem 0x5010-0x50100fff irq 11
at device 0.1 on pci2
cardbus1: CardBus bus on cbb1
pccard1: 16-bit PCCard bus on cbb1
pci2: serial bus, FireWire at device 0.2 (no driver attached)
fxp0: Intel 82801CAM (ICH3) Pro/100 VE Ethernet port 0x8000-0x803f
mem 0xd020-0xd0200fff irq 11 at device 8.0 on pci2
miibus0: MII bus on fxp0
inphy0: i82562ET 10/100 media interface PHY 1 on miibus0
inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto,
auto-flow
fxp0: Ethernet address: 

Re: Looping during boot-up process in FreeBSD-11 current

2014-09-28 Thread Steven Hartland

You'll only need a new kernel, and if you cut it down to just the modules /
drivers you need then that shouldn't take too long.

   Regards
   Steve

- Original Message - 
From: Mike. the.li...@mgm51.com

To: freebsd-current@freebsd.org
Sent: Sunday, September 28, 2014 5:43 PM
Subject: Re: Looping during boot-up process in FreeBSD-11 current





On 9/28/2014 at 5:01 PM Steven Hartland wrote:

|The only recent ATAPI change I recall is 270327, does it still occur
|if you revert that?
|
=


OK, I'll download the 11-current source.

Then revert 270327
https://svnweb.freebsd.org/base/head/sys/cam/ata/ata_xpt.c?r1=270327;
r2=270326pathrev=270327

Recompile the system

And try to boot.


I'll post the results in a couple of days (full system compiles take
a while on this notebook).

(unless someone has a 11-current ISO snapshot from before 270327?)







Re: What do you use for kernel debugging?

2014-09-28 Thread Steven Hartland
- Original Message - 
From: Benjamin Kaduk ka...@mit.edu

To: José Pérez Arauzo f...@aoek.com
Cc: FreeBSD Current freebsd-current@freebsd.org
Sent: Sunday, September 28, 2014 8:54 PM
Subject: Re: What do you use for kernel debugging?



On Sun, 28 Sep 2014, José Pérez Arauzo wrote:


Hello,
I am trying to track down a (deadlock?) issue in CURRENT via DDB. The kernel does
not complete hw probes on my Acer V5.

I get stuck on apic_isr looping which leads nowhere.

So I thought maybe things improve if I debug from another machine.


What do you use for kernel debugging? According to the handbook kgdb over serial
is a good option, do you agree? I'm on a netbook with no ethernet and no option
for firewire: can I have a USB / nullmodem setup to work?


You cannot.


I have no old-style uarts hardware anymore, as the handbook suggests...

Any idea is welcome before I buy extra hw. I have a USB to serial showing up as
/dev/cuaU0, do I need to grab another one and a nullmodem cable or are there better
alternatives? Thank you.


I'm not sure that there are alternatives at all, unfortunately.

You may be reduced to debugging-via-printf.


dtrace can also be quite invaluable.

   Regards
   Steve 



Re: zpool frag

2014-09-21 Thread Steven Hartland

Backing up the pool and restoring it is the only way I'm aware of.
- Original Message - 
From: Beeblebrox zap...@berentweb.com

To: freebsd-current@freebsd.org
Sent: Sunday, September 21, 2014 9:57 AM
Subject: zpool frag



FRAG means fragmentation, right? Zpool fragmentation? That's news to me. If
this is real how do I fix it?

NAME    SIZE  ALLOC   FREE  FRAG  EXPANDSZ  CAP  DEDUP  HEALTH  ALTROOT
pool1  75.5G  53.7G  21.8G   60%         -  71%  1.00x  ONLINE  -
pool2  48.8G  26.2G  22.6G   68%         -  53%  1.00x  ONLINE  -
pool3   204G   177G  27.0G   53%         -  86%  1.11x  ONLINE  -

Regards.



-
FreeBSD-11-current_amd64_root-on-zfs_RadeonKMS


Re: zpool frag

2014-09-21 Thread Steven Hartland


- Original Message - 

From: Peter Wemm pe...@wemm.org
On Sunday, September 21, 2014 11:06:10 AM Allan Jude wrote:
 On 2014-09-21 04:57, Beeblebrox wrote:
  FRAG means fragmentation, right? Zpool fragmentation? That's news to me.
  If
  this is real how do I fix it?
  
  NAME    SIZE  ALLOC   FREE  FRAG  EXPANDSZ  CAP  DEDUP  HEALTH  ALTROOT
  pool1  75.5G  53.7G  21.8G   60%         -  71%  1.00x  ONLINE  -
  pool2  48.8G  26.2G  22.6G   68%         -  53%  1.00x  ONLINE  -
  pool3   204G   177G  27.0G   53%         -  86%  1.11x  ONLINE  -

 It is not something you 'fix', it is just a metric to help you
 understand the performance of your pool. The higher the fragmentation,
 the longer it might take to allocate new space, and obviously you will
 have more random seek time while reading from the pool.
 
 As Steven mentions, there is no defragmentation tool for ZFS. You can
 zfs send/recv or backup/restore the pool if you have a strong enough
 reason to want to get the fragmentation number down.
 
 It is a fairly natural side effect of a copy-on-write file system.
 
 Note: the % is not the % fragmented, IIRC, it is the percentage of the
 free blocks that are less than a specific size. I forget what that size is.

I fear that the information presented in its current form is going to generate 
lots of fear and confusion.


The other thing to consider is that this gets much, much worse as the pool 
fills up.  Even UFS has issues with fragmentation when it fills, but ZFS is far 
more sensitive to it.  In the freebsd.org cluster we have a health check alert 
at 80% full, but even that's probably on the high side.


This should be less of an issue if you have the spacemap_histogram feature
enabled on the pool, which IIRC if you're seeing FRAG details should be the case.

   Regards
   Steve


Re: zpool: multiple IDs, CURRENT drops all pools after reboot

2014-09-16 Thread Steven Hartland

One of my backup drives dedicated to a ZPOOL is faulting and showing up with
multiple IDs. The only working ID is id: 257822624560506537.

FreeBSD CURRENT with three ZFS disks and only 4GB of RAM is very flaky regarding
this issue: today, the whole poolset vanished two times after a reboot. Giving
the box 8 GB total and rebooting doesn't show the problem; it gets more frequent
when reducing the RAM to 4GB (FreeBSD 11.0-CURRENT #2 r271684: Tue Sep 16
20:41:47 CEST 2014). This is a bit spooky.

Below is the faulted hard drive. I guess the drive/pool shown below somehow
triggers the loss of all other pools (I have to import the other pools, which do
not have any defects, but then they drop out after a reboot and vanish).

Is there a way getting rid of the faulty IDs without destroying the pool?

Regards,

Oliver 


 root@thor: [/etc] zpool import
   pool: BACKUP00
 id: 9337833315545958689
  state: FAULTED
 status: One or more devices contains corrupted data.
 action: The pool cannot be imported due to damaged devices or data.
The pool may be active on another system, but can be imported using
the '-f' flag.
   see: http://illumos.org/msg/ZFS-8000-5E
 config:

BACKUP00   FAULTED  corrupted data
  8544670861382329237  UNAVAIL  corrupted data

   pool: BACKUP00
 id: 257822624560506537
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

BACKUP00ONLINE
  ada3p1ONLINE



Might be a long shot but check out the patches on:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594

Specifically:
https://bugs.freebsd.org/bugzilla/attachment.cgi?id=147070

And if that doesn't work:
https://bugs.freebsd.org/bugzilla/attachment.cgi?id=147286

The second has all the changes from the first with the addition
of some changes which dynamically size the max dirty data.

These changes are in discussion and it's likely the additions
in the second patch aren't the right direction but they
have been reported to show good improvements under high
memory pressure for certain workloads, so would be interesting
to see if they help with your problem.

All that said you shouldn't end up with corrupt data no matter
what.

Are there any other symptoms? Has memory been checked for
faults etc?

   Regards
   Steve


Re: ZFS-related panic: possible spa->spa_errlog_lock deadlock

2014-09-07 Thread Steven Hartland


- Original Message - 
From: Xin Li delp...@delphij.net

To: Fabian Keil freebsd-lis...@fabiankeil.de; freebsd-current@freebsd.org
Cc: Alexander Motin m...@ixsystems.com
Sent: Sunday, September 07, 2014 4:56 PM
Subject: Re: ZFS-related panic: possible spa->spa_errlog_lock deadlock




On 9/7/14 11:23 PM, Fabian Keil wrote:

Xin Li delp...@delphij.net wrote:


On 9/7/14 9:02 PM, Fabian Keil wrote:

Using a kernel built from FreeBSD 11.0-CURRENT r271182 I got
the following panic yesterday:

[...] Unread portion of the kernel message buffer: [6880]
panic: deadlkres: possible deadlock detected for
0xf80015289490, blocked for 1800503 ticks


Any chance to get all backtraces (e.g. thread apply all bt full
16)? I think a different thread that held the lock has been
blocked, probably related to your disconnected vdev.


Output of thread apply all bt full 16 is available at: 
http://www.fabiankeil.de/tmp/freebsd/kgdb-output-spa_errlog_lock-deadlock.txt


 A lot of the backtraces prematurely end with Cannot access memory
at address, therefore I also added thread apply all bt output.

Apparently there are at least two additional threads blocking below
spa_get_stats():

Thread 1182 (Thread 101989):
#0  sched_switch (td=0xf800628cc490, newtd=<value optimized out>, flags=<value optimized out>)
    at /usr/src/sys/kern/sched_ule.c:1932
#1  0x805a23c1 in mi_switch (flags=260, newtd=0x0)
    at /usr/src/sys/kern/kern_synch.c:493
#2  0x805e4bca in sleepq_wait (wchan=0x0, pri=0)
    at /usr/src/sys/kern/subr_sleepqueue.c:631
#3  0x80539f10 in _cv_wait (cvp=0xf80025534a50, lock=0xf80025534a30)
    at /usr/src/sys/kern/kern_condvar.c:139
#4  0x811721db in zio_wait (zio=<value optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1442
#5  0x81102111 in dbuf_read (db=<value optimized out>, zio=<value optimized out>, flags=<value optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:649
#6  0x81108e6d in dmu_buf_hold (os=<value optimized out>, object=<value optimized out>, offset=<value optimized out>, tag=0x0, dbp=0xfe00955c6648, flags=<value optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:172
#7  0x81163986 in zap_lockdir (os=0xf8002b7ab000, obj=92, tx=0x0, lti=RW_READER, fatreader=1, adding=0, zapp=<value optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:467
#8  0x811644ad in zap_count (os=0x0, zapobj=0, count=0xfe00955c66d8)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:712
#9  0x8114a6dc in spa_get_errlog_size (spa=0xf800062ed000)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_errlog.c:149
#10 0x8113f549 in spa_get_stats (name=0xfe0044cac000 spaceloop, config=0xfe00955c68e8, altroot=0xfe0044cac430 , buflen=2048)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:3287
#11 0x81189a45 in zfs_ioc_pool_stats (zc=0xfe0044cac000)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c:1656
#12 0x81187290 in zfsdev_ioctl (dev=<value optimized out>, zcmd=<value optimized out>, arg=<value optimized out>, flag=<value optimized out>, td=<value optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c:6136
#13 0x80464a55 in devfs_ioctl_f (fp=0xf80038bd00a0, com=3222821381, data=0xf800067b80a0, cred=<value optimized out>, td=0xf800628cc490)
    at /usr/src/sys/fs/devfs/devfs_vnops.c:757
#14 0x805f3c3d in kern_ioctl (td=0xf800628cc490, fd=<value optimized out>, com=0)
    at file.h:311
#15 0x805f381c in sys_ioctl (td=0xf800628cc490, uap=0xfe00955c6b80)
    at /usr/src/sys/kern/sys_generic.c:702
#16 0x8085c2db in amd64_syscall (td=0xf800628cc490, traced=0)
    at subr_syscall.c:133
#17 0x8083f90b in Xfast_syscall ()
    at /usr/src/sys/amd64/amd64/exception.S:390
#18 0x0008019fc3da in ?? ()
Previous frame inner to this frame (corrupt stack?)


Yes, thread 1182 owned the lock and is waiting for the zio to be done.
Other threads that wanted the lock would have to wait.

I don't have much clue why the system entered this state, however, as
the operations should have errored out (the GELI device is gone on
21:44:56 based on your log, which suggests all references were closed)
instead of waiting.

Adding mav@ as he may have some idea.


We've seen a disk drop invalidating a pool before, which should fail
all reads / writes, but processes have instead just wedged in the kernel.

From experience I'd say it happens ~5% of the time, so it's quite hard to
catch.

Unfortunately I never managed to get a dump of it.

   Regards
   Steve

Re: i386 compilation errors in head/sys/dev/ixl/if_ixl.c

2014-08-29 Thread Steven Hartland

Looks like this was already fixed by:
http://svnweb.freebsd.org/changeset/base/270799

   Regards
   Steve

- Original Message - 
From: David Shao davs...@gmail.com

To: freebsd-current@freebsd.org
Sent: Friday, August 29, 2014 6:38 AM
Subject: i386 compilation errors in head/sys/dev/ixl/if_ixl.c



Compilation errors occur in head/sys/dev/ixl/if_ixl.c on i386 for
FreeBSD 11-current for the following:

In function ixl_print_debug_info()

   printf("Queue irqs = %lx\n", que->irqs);
   printf("AdminQ irqs = %lx\n", pf->admin_irq);
...
   printf("RX not ready = %lx\n", rxr->not_done);
   printf("RX packets = %lx\n", rxr->rx_packets);

all cause
error: format specifies type 'unsigned long' but the argument has type
'u64' (aka 'unsigned long long') [-Werror,-Wformat]

In function ixl_stat_update48(struct i40e_hw *hw, u32 hireg, u32 loreg,
   bool offset_loaded, u64 *offset, u64 *stat)

#if __FreeBSD__ >= 10 && __amd64__

causes
error:  '__amd64__' is not defined, evaluates to 0 [-Werror,-Wundef]






Re: Build failed in Jenkins: FreeBSD_HEAD #1334

2014-08-28 Thread Steven Hartland

This should already be fixed by r270758

- Original Message - 
From: jenkins-ad...@freebsd.org
To: jenkins-ad...@freebsd.org; freebsd-current@freebsd.org; 
rodr...@freebsd.org; j...@freebsd.org

Sent: Thursday, August 28, 2014 8:53 PM
Subject: Build failed in Jenkins: FreeBSD_HEAD #1334


See 
https://jenkins.freebsd.org/jenkins/job/FreeBSD_HEAD/1334/changes


Changes:

[jfv] Add XL710 device entries to NOTES, and directories to the module
Makefile so they will be built.

MFC after: 1 day

[rodrigc] Use file -s, so that we can run vmrun.sh against special devices such
as /dev/md memory file systems

Reviewed by: neel

--
[...truncated 254021 lines...]
   eventhandler_tagvlan_detach;
   ^
--- all_subdir_ixl ---
https://jenkins.freebsd.org/jenkins/job/FreeBSD_HEAD/ws/sys/modules/ixl/../../dev/ixl/if_ixl.c:650:21: 
error: implicit declaration of function 'EVENTHANDLER_REGISTER' is 
invalid in C99 [-Werror,-Wimplicit-function-declaration]

   vsi->vlan_attach = EVENTHANDLER_REGISTER(vlan_config,
  ^
--- all_subdir_ixlv ---
https://jenkins.freebsd.org/jenkins/job/FreeBSD_HEAD/ws/sys/modules/ixlv/../../dev/ixl/if_ixlv.c:498:21: 
error: implicit declaration of function 'EVENTHANDLER_REGISTER' is 
invalid in C99 [-Werror,-Wimplicit-function-declaration]

   vsi->vlan_attach = EVENTHANDLER_REGISTER(vlan_config,
  ^
--- all_subdir_ixl ---
https://jenkins.freebsd.org/jenkins/job/FreeBSD_HEAD/ws/sys/modules/ixl/../../dev/ixl/if_ixl.c:650:43: 
error: use of undeclared identifier 'vlan_config'

   vsi->vlan_attach = EVENTHANDLER_REGISTER(vlan_config,
^
https://jenkins.freebsd.org/jenkins/job/FreeBSD_HEAD/ws/sys/modules/ixl/../../dev/ixl/if_ixl.c:652:43: 
error: use of undeclared identifier 'vlan_unconfig'

   vsi->vlan_detach = EVENTHANDLER_REGISTER(vlan_unconfig,
^
https://jenkins.freebsd.org/jenkins/job/FreeBSD_HEAD/ws/sys/modules/ixl/../../dev/ixl/if_ixl.c:670:3: 
error: implicit declaration of function 'if_free' is invalid in C99 
[-Werror,-Wimplicit-function-declaration]

   if_free(vsi->ifp);
   ^
--- all_subdir_ixlv ---
https://jenkins.freebsd.org/jenkins/job/FreeBSD_HEAD/ws/sys/modules/ixlv/../../dev/ixl/if_ixlv.c:498:43: 
error: use of undeclared identifier 'vlan_config'

   vsi->vlan_attach = EVENTHANDLER_REGISTER(vlan_config,
^
--- all_subdir_ixl ---
https://jenkins.freebsd.org/jenkins/job/FreeBSD_HEAD/ws/sys/modules/ixl/../../dev/ixl/if_ixl.c:670:3: 
note: did you mean 'm_free'?

--- all_subdir_ixlv ---
https://jenkins.freebsd.org/jenkins/job/FreeBSD_HEAD/ws/sys/modules/ixlv/../../dev/ixl/if_ixlv.c:500:43: 
error: use of undeclared identifier 'vlan_unconfig'

   vsi->vlan_detach = EVENTHANDLER_REGISTER(vlan_unconfig,
^
--- all_subdir_ixl ---
@/sys/mbuf.h:1138:1: note: 'm_free' declared here
m_free(struct mbuf *m)
^
--- all_subdir_ixlv ---
https://jenkins.freebsd.org/jenkins/job/FreeBSD_HEAD/ws/sys/modules/ixlv/../../dev/ixl/if_ixlv.c:550:14: 
error: incomplete definition of type 'struct ifnet'

   if (vsi->ifp->if_vlantrunk != NULL) {
   ^
--- all_subdir_ixl ---
https://jenkins.freebsd.org/jenkins/job/FreeBSD_HEAD/ws/sys/modules/ixl/../../dev/ixl/if_ixl.c:698:14: 
error: incomplete definition of type 'struct ifnet'

   if (vsi->ifp->if_vlantrunk != NULL) {
   ^
--- all_subdir_ixlv ---
@/sys/mbuf.h:123:9: note: forward declaration of 'struct ifnet'
   struct ifnet*rcvif; /* rcv interface */
  ^
https://jenkins.freebsd.org/jenkins/job/FreeBSD_HEAD/ws/sys/modules/ixlv/../../dev/ixl/if_ixlv.c:557:14: 
error: incomplete definition of type 'struct ifnet'

   if (vsi->ifp->if_drv_flags & IFF_DRV_RUNNING) {
   ^
--- all_subdir_ixl ---
@/sys/mbuf.h:123:9: note: forward declaration of 'struct ifnet'
   struct ifnet*rcvif; /* rcv interface */
  ^
--- all_subdir_ixlv ---
@/sys/mbuf.h:123:9: note: forward declaration of 'struct ifnet'
   struct ifnet*rcvif; /* rcv interface */
  ^
https://jenkins.freebsd.org/jenkins/job/FreeBSD_HEAD/ws/sys/modules/ixlv/../../dev/ixl/if_ixlv.c:566:18: 
error: incomplete definition of type 'struct ifnet'

   while (vsi->ifp->if_drv_flags & IFF_DRV_RUNNING
  ^
@/sys/mbuf.h:123:9: note: forward declaration of 'struct ifnet'
   struct ifnet*rcvif; /* rcv interface */
  ^
https://jenkins.freebsd.org/jenkins/job/FreeBSD_HEAD/ws/sys/modules/ixlv/../../dev/ixl/if_ixlv.c:578:3: 
error: implicit declaration of function 'EVENTHANDLER_DEREGISTER' is 
invalid in C99 [-Werror,-Wimplicit-function-declaration]

   EVENTHANDLER_DEREGISTER(vlan_config, 

Re: [ZFS][PANIC] Solaris Assert/zio.c:2548

2014-07-20 Thread Steven Hartland

Can you provide the details of the zio which caused the panic?

Also, do any of your pools support TRIM?

   Regards
   Steve

- Original Message - 
From: Larry Rosenman l...@lerctr.org

To: freebsd...@freebsd.org; freebsd-current@freebsd.org
Sent: Sunday, July 20, 2014 3:03 PM
Subject: [ZFS][PANIC] Solaris Assert/zio.c:2548



Got the following panic overnight (I think while a nightly rsync was running):

Dump header from device /dev/gpt/swap0
 Architecture: amd64
 Architecture Version: 2
 Dump Length: 8122101760B (7745 MB)
 Blocksize: 512
 Dumptime: Sun Jul 20 03:22:18 2014
 Hostname: borg.lerctr.org
 Magic: FreeBSD Kernel Dump
 Version String: FreeBSD 11.0-CURRENT #50 r268894M: Sat Jul 19 18:06:08 CDT 2014
   r...@borg.lerctr.org:/usr/obj/usr/src/sys/VT-LER
 Panic String: solaris assert: !(zio->io_flags & ZIO_FLAG_DELEGATED), file: 
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c, line: 2874

 Dump Parity: 763150733
 Bounds: 5
 Dump Status: good


borg.lerctr.org dumped core - see /var/crash/vmcore.5

Sun Jul 20 03:28:12 CDT 2014

FreeBSD borg.lerctr.org 11.0-CURRENT FreeBSD 11.0-CURRENT #50 r268894M: Sat Jul 19 18:06:08 CDT 2014 
r...@borg.lerctr.org:/usr/obj/usr/src/sys/VT-LER  amd64


panic: solaris assert: !(zio->io_flags & ZIO_FLAG_DELEGATED), file: 
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c, line: 2874


GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for details.
This GDB was configured as amd64-marcel-freebsd...

Unread portion of the kernel message buffer:
panic: solaris assert: !(zio->io_flags & ZIO_FLAG_DELEGATED), file: 
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c, line: 2874

cpuid = 7
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe100c49f930
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfe100c49f9e0
vpanic() at vpanic+0x126/frame 0xfe100c49fa20
panic() at panic+0x43/frame 0xfe100c49fa80
assfail() at assfail+0x1d/frame 0xfe100c49fa90
zio_vdev_io_assess() at zio_vdev_io_assess+0x2ed/frame 0xfe100c49fac0
zio_execute() at zio_execute+0x1e9/frame 0xfe100c49fb20
taskqueue_run_locked() at taskqueue_run_locked+0xf0/frame 0xfe100c49fb80
taskqueue_thread_loop() at taskqueue_thread_loop+0x9b/frame 0xfe100c49fbb0
fork_exit() at fork_exit+0x84/frame 0xfe100c49fbf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe100c49fbf0
--- trap 0, rip = 0, rsp = 0xfe100c49fcb0, rbp = 0 ---
Uptime: 8h57m17s
(ada2:ahcich2:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
(ada2:ahcich2:0:0:0): CAM status: Command timeout
(ada2:ahcich2:0:0:0): Error 5, Retries exhausted
(ada2:ahcich2:0:0:0): Synchronize cache failed
Dumping 7745 out of 64463 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

Reading symbols from /boot/kernel/linux.ko.symbols...done.
Loaded symbols for /boot/kernel/linux.ko.symbols
Reading symbols from /boot/kernel/if_lagg.ko.symbols...done.
Loaded symbols for /boot/kernel/if_lagg.ko.symbols
Reading symbols from /boot/kernel/snd_envy24ht.ko.symbols...done.
Loaded symbols for /boot/kernel/snd_envy24ht.ko.symbols
Reading symbols from /boot/kernel/snd_spicds.ko.symbols...done.
Loaded symbols for /boot/kernel/snd_spicds.ko.symbols
Reading symbols from /boot/kernel/coretemp.ko.symbols...done.
Loaded symbols for /boot/kernel/coretemp.ko.symbols
Reading symbols from /boot/kernel/ichsmb.ko.symbols...done.
Loaded symbols for /boot/kernel/ichsmb.ko.symbols
Reading symbols from /boot/kernel/smbus.ko.symbols...done.
Loaded symbols for /boot/kernel/smbus.ko.symbols
Reading symbols from /boot/kernel/ichwd.ko.symbols...done.
Loaded symbols for /boot/kernel/ichwd.ko.symbols
Reading symbols from /boot/kernel/cpuctl.ko.symbols...done.
Loaded symbols for /boot/kernel/cpuctl.ko.symbols
Reading symbols from /boot/kernel/crypto.ko.symbols...done.
Loaded symbols for /boot/kernel/crypto.ko.symbols
Reading symbols from /boot/kernel/cryptodev.ko.symbols...done.
Loaded symbols for /boot/kernel/cryptodev.ko.symbols
Reading symbols from /boot/kernel/dtraceall.ko.symbols...done.
Loaded symbols for /boot/kernel/dtraceall.ko.symbols
Reading symbols from /boot/kernel/profile.ko.symbols...done.
Loaded symbols for /boot/kernel/profile.ko.symbols
Reading symbols from /boot/kernel/cyclic.ko.symbols...done.
Loaded symbols for /boot/kernel/cyclic.ko.symbols
Reading symbols from /boot/kernel/dtrace.ko.symbols...done.
Loaded symbols for /boot/kernel/dtrace.ko.symbols
Reading symbols from /boot/kernel/systrace_freebsd32.ko.symbols...done.
Loaded symbols for /boot/kernel/systrace_freebsd32.ko.symbols
Reading symbols from /boot/kernel/systrace.ko.symbols...done.
Loaded symbols for /boot/kernel/systrace.ko.symbols

Re: [ZFS][PANIC] Solaris Assert/zio.c:2548

2014-07-20 Thread Steven Hartland

Something like the following should allow you to get the zio details,
assuming the compiler hasn't optimised it out:

cd /var/crash
kgdb /boot/kernel/kernel /var/crash/vmcore.5
(kgdb) frame 5
(kgdb) print zio

   Regards
   Steve

- Original Message - 
From: Larry Rosenman l...@lerctr.org

To: Steven Hartland kill...@multiplay.co.uk
Cc: freebsd...@freebsd.org; freebsd-current@freebsd.org
Sent: Sunday, July 20, 2014 8:20 PM
Subject: Re: [ZFS][PANIC] Solaris Assert/zio.c:2548



On 2014-07-20 14:18, Steven Hartland wrote:

Can you provide the details of the zio which caused the panic?

Also, do any of your pools support TRIM?


No, on the trim.  Can you walk me through getting the zio you need?




Re: [ZFS][PANIC] Solaris Assert/zio.c:2548

2014-07-20 Thread Steven Hartland

Can you try reverting r265321 and see if you still see the
same crash?

   Regards
   Steve


Re: [ZFS][PANIC] Solaris Assert/zio.c:2548

2014-07-20 Thread Steven Hartland
- Original Message - 
From: Larry Rosenman l...@lerctr.org

To: Steven Hartland kill...@multiplay.co.uk
Cc: freebsd...@freebsd.org; freebsd-current@freebsd.org
Sent: Monday, July 21, 2014 12:22 AM
Subject: Re: [ZFS][PANIC] Solaris Assert/zio.c:2548



On 2014-07-20 18:21, Steven Hartland wrote:

Can you try reverting r265321 and see if you still see the
same crash?

   Regards
   Steve

I'll do the revert, but it's been a ONE TIME hit.

There was a followup to mine with a reproducible poudriere crash like 
mine.


If you don't have a reproducible scenario I'd hold off.

Florian, is yours reproducible, and can you send me
a pretty print of the crashing zio?

   Regards
   Steve


Re: [ZFS][PANIC] Solaris Assert/zio.c:2548

2014-07-20 Thread Steven Hartland


- Original Message - 
From: Dan Mack m...@macktronics.com



I think I may have hit the same problem; I'm going to stay connected
to the console and see if it happens again; this is what I see
currently with the back-trace:

db> bt
Tracing pid 0 tid 100070 td 0xf8000e088920
kdb_enter() at kdb_enter+0x3e/frame 0xfe085ef1d980
vpanic() at vpanic+0x146/frame 0xfe085ef1d9c0
panic() at panic+0x43/frame 0xfe085ef1da20
deadlkres() at deadlkres+0x35c/frame 0xfe085ef1da70
fork_exit() at fork_exit+0x84/frame 0xfe085ef1dab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe085ef1dab0
--- trap 0, rip = 0, rsp = 0xfe085ef1db70, rbp = 0 ---

I just updated to I think 268921 earlier today and this is the first
time I've had a panic (HEAD-268921 that is)

I'll try to get some more data if I can get it back up and running.


That doesn't look like a related trace tbh.

   Regards
   Steve


Re: [ZFS][PANIC] Solaris Assert/zio.c:2548

2014-07-20 Thread Steven Hartland


- Original Message - 
From: Dan Mack m...@macktronics.com

To: Steven Hartland kill...@multiplay.co.uk
Cc: freebsd...@freebsd.org; freebsd-current@freebsd.org; Larry Rosenman 
l...@lerctr.org
Sent: Monday, July 21, 2014 2:29 AM
Subject: Re: [ZFS][PANIC] Solaris Assert/zio.c:2548



On Mon, 21 Jul 2014, Steven Hartland wrote:


I just updated to I think 268921 earlier today and this is the first
time I've had a panic (HEAD-268921 that is)

I'll try to get some more data if I can get it back up and running.


That doesn't look like a related trace tbh.

  Regards
  Steve


After rebooting with a dumpdev; I got this :

kbd2 at ukbd0
Trying to mount root from zfs:tank []...
panic: deadlkres: possible deadlock detected for 0xf8000e089000, blocked 
for 1801216 ticks

cpuid = 6
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe085ef1d8d0
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfe085ef1d980
vpanic() at vpanic+0x126/frame 0xfe085ef1d9c0
panic() at panic+0x43/frame 0xfe085ef1da20
deadlkres() at deadlkres+0x35c/frame 0xfe085ef1da70
fork_exit() at fork_exit+0x84/frame 0xfe085ef1dab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe085ef1dab0
--- trap 0, rip = 0, rsp = 0xfe085ef1db70, rbp = 0 ---
KDB: enter: panic
[ thread pid 0 tid 100070 ]
Stopped at  kdb_enter+0x3e: movq$0,kdb_why

I cannot seem to get past this yet so I'm open to suggestions.  I'm
still at the db prompt if you'd like me to attempt to collect more
info.


For some reason the deadlock detector is triggering, not sure why.

I'd recommend starting a new thread to discuss this as it doesn't
appear to be related to this thread.

The only thing I could suggest is disabling it to see if it truly
is a deadlock or if something is just being really slow:
vfs.zfs.deadman_enabled=0

If this is new then it would be good for you to try and identify
which of the changes introduced it, so do a binary chop on versions
back to your last known good.
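
The "binary chop" can be sketched as a bisection over revision numbers.
This is only an illustration: first_bad_revision() and example_is_bad()
are hypothetical names, the threshold in example_is_bad() is made up, and
in practice the predicate means "build, boot and test that revision by
hand". It also assumes that once the regression appears, every later
revision shows it too.

```c
#include <stdbool.h>

/* Hypothetical predicate: true if the kernel built from rev misbehaves. */
typedef bool (*is_bad_fn)(long rev);

/*
 * Stand-in for "build, boot and test revision rev".
 * The threshold below is made up purely for illustration.
 */
bool
example_is_bad(long rev)
{
	return (rev >= 268900);
}

/*
 * Return the first bad revision in (good, bad], assuming good is known
 * good, bad is known bad, and every revision after the first bad one is
 * also bad.
 */
long
first_bad_revision(long good, long bad, is_bad_fn is_bad)
{
	while (bad - good > 1) {
		long mid = good + (bad - good) / 2;

		if (is_bad(mid))
			bad = mid;	/* regression is at or before mid */
		else
			good = mid;	/* regression is after mid */
	}
	return (bad);
}
```

Each step halves the remaining range, so roughly a hundred candidate
revisions need about seven build-and-boot cycles.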

   Regards
   Steve


Re: [ZFS][PANIC] Solaris Assert/zio.c:2548

2014-07-20 Thread Steven Hartland


- Original Message - 
From: Dan Mack m...@macktronics.com

To: Steven Hartland kill...@multiplay.co.uk
Cc: freebsd...@freebsd.org; freebsd-current@freebsd.org; Larry Rosenman 
l...@lerctr.org
Sent: Monday, July 21, 2014 2:29 AM
Subject: Re: [ZFS][PANIC] Solaris Assert/zio.c:2548



On Mon, 21 Jul 2014, Steven Hartland wrote:


I just updated to I think 268921 earlier today and this is the first
time I've had a panic (HEAD-268921 that is)

I'll try to get some more data if I can get it back up and running.


That doesn't look like a related trace tbh.

  Regards
  Steve


After rebooting with a dumpdev; I got this :

kbd2 at ukbd0
Trying to mount root from zfs:tank []...
panic: deadlkres: possible deadlock detected for 0xf8000e089000, blocked 
for 1801216 ticks

cpuid = 6
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe085ef1d8d0
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfe085ef1d980
vpanic() at vpanic+0x126/frame 0xfe085ef1d9c0
panic() at panic+0x43/frame 0xfe085ef1da20
deadlkres() at deadlkres+0x35c/frame 0xfe085ef1da70
fork_exit() at fork_exit+0x84/frame 0xfe085ef1dab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe085ef1dab0
--- trap 0, rip = 0, rsp = 0xfe085ef1db70, rbp = 0 ---
KDB: enter: panic
[ thread pid 0 tid 100070 ]
Stopped at  kdb_enter+0x3e: movq$0,kdb_why

I cannot seem to get past this yet so I'm open to suggestions.  I'm
still at the db prompt if you'd like me to attempt to collect more
info.


Just spotted an interesting message on a recent commit which may be
relevant:


URL: http://svnweb.freebsd.org/changeset/base/268855
This specific commit makes boot hang just before mounting the root 
dataset for me when vfs.zfs.vdev.cache.size tunable is set. Unsetting 
this tunable or reverting this commit (currently running r268933 minus 
r268855) fixes the boot for me.


Please let me know if I can provide any more information.

- Nikolai Lifanov


The current code disables vdev caching by default so this will only
occur if manually enabled.

The code details the reason for this as:-
* TODO: Note that with the current ZFS code, it turns out that the
* vdev cache is not helpful, and in some cases actually harmful.  It
* is better if we disable this.  Once some time has passed, we should
* actually remove this to simplify the code.  For now we just disable
* it by setting the zfs_vdev_cache_size to zero.  Note that Solaris 11
* has made these same changes.

   Regards
   Steve



Re: ZFS ARC max sort of not working?

2014-07-04 Thread Steven Hartland

Perhaps related to the recent global sysctl changes?
- Original Message - 
From: Sean Bruno sbr...@ignoranthack.me

To: freebsd-current freebsd-current@freebsd.org
Sent: Friday, July 04, 2014 4:56 PM
Subject: ZFS ARC max sort of not working?



It looks like the following no longer works on head?

vfs.zfs.arc_max=8G

But this does?

vfs.zfs.arc_max=8589934592

sean
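
For reference, the two values above are meant to be the same number of
bytes: 8G is 8 * 2^30 = 8589934592. A minimal sketch of what a
suffix-aware parser does is below. parse_size() is a hypothetical helper
written for this illustration, not the code the kernel or sysctl(8)
actually uses, and the set of accepted suffixes (K, M, G, T) is an
assumption.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Convert a string such as "8G" or "8589934592" to a byte count.
 * Returns 0 on success, -1 on a malformed string or overflow.
 * Hypothetical helper, for illustration only.
 */
int
parse_size(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long v;
	unsigned shift = 0;

	if (s == NULL || !isdigit((unsigned char)s[0]))
		return (-1);
	errno = 0;
	v = strtoull(s, &end, 10);
	if (errno != 0)
		return (-1);

	/* Optional binary suffix directly after the digits. */
	switch (toupper((unsigned char)*end)) {
	case '\0':
		break;
	case 'K':
		shift = 10; end++;
		break;
	case 'M':
		shift = 20; end++;
		break;
	case 'G':
		shift = 30; end++;
		break;
	case 'T':
		shift = 40; end++;
		break;
	default:
		return (-1);
	}
	if (*end != '\0')
		return (-1);		/* trailing garbage */
	if (shift != 0 && v > (UINT64_MAX >> shift))
		return (-1);		/* would overflow 64 bits */
	*out = (uint64_t)v << shift;
	return (0);
}
```

A parser that drops the suffix handling would reject "8G" (or stop at the
"8"), which would match the symptom described.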



Re: 11.0-CURRENT #1 r267422: OpenLDAP fails to startup out of the blue after buildworld

2014-06-12 Thread Steven Hartland

I've never used this, but error 13 is "permission denied" (EACCES) in errno.h,
so I'm guessing something has messed with some permissions, possibly
a file permission there somewhere?

If it's not obvious, then you could try running it under truss to see what
is returning error 13.
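
As a quick way to double check what a bare error number in a log means,
a few lines of C are enough. describe_errno() is a hypothetical helper
name; strerror() and EACCES are standard. This assumes the number slapd
logs is a plain errno value, which the message suggests but does not
prove.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "error N: <text>" for an errno value into buf.
 * Returns what snprintf() returns. Hypothetical helper, for illustration.
 */
int
describe_errno(int err, char *buf, size_t len)
{
	return (snprintf(buf, len, "error %d: %s", err, strerror(err)));
}
```

Calling describe_errno(13, buf, sizeof(buf)) and printing buf shows the
text for EACCES, which is 13 on FreeBSD.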

   Regards
   Steve
- Original Message - 
From: O. Hartmann ohart...@zedat.fu-berlin.de


After updating ports yesterday with all this ICU update horror today
slapd rejects to start out of the blue after months of working like a
charm:

[...]
5399feba slapd startup: initiated.
5399feba backend_startup_one: starting cn=config
5399feba config_back_db_open
5399feba send_ldap_result: conn=-1 op=0 p=0
5399feba backend_startup_one: starting dc=dumami
5399feba mdb_db_open: database dc=dumami: dbenv_open(/var/db/openldap-data/).
5399feba mdb_db_open: database dc=dumami cannot be opened, err 13. Restore from backup!
5399feba backend_startup_one (type=mdb, suffix=dc=dumami): bi_db_open failed! (13)
5399feba slapd shutdown: initiated
5399feba slapd destroy: freeing system resources.
5399feba syncinfo_free: rid=001
5399feba syncinfo_free: rid=003
5399feba slapd stopped.
/usr/local/etc/rc.d/slapd: WARNING: failed to start slapd

According to that useless suggestion to restore from backup, I
restored the configuration and the users from backups. slapadd works
fine. But then starting the server fails again.

Via portmaster -f openldap24-server I tried to rebuild all ports
necessary for that fragile OpenLDAP thing, but still no success. I can
not find any hints in the log (using -d1 or -d257 starting slapd)
except the failure shown above. Since the very same configuration and
dataset worked for months now and even after the massive icu-related
update of ports yesterday (ended by restarting slapd), I wouldn't
expect any useful hint.

Can anybody offer suggestions, please? I'm out of ideas. I find it very
strange.

Regards,
Oliver




Re: CURRENT: why is CURRENT swapping so fast?

2014-06-11 Thread Steven Hartland
- Original Message - 
From: Matthias Andree matthias.and...@gmx.de




Am 12.06.2014 00:36, schrieb O. Hartmann:


I use my boxes for daily work, and in most cases the usage of applications is
the same. Compiling the OS and updating ports while having claws-mail and
firefox open is a usual scenario.

For a couple of weeks now, if not months (always staying on 11.0-CURRENT), I
have noticed that the system, even with 8 GB RAM, very quickly runs out of
memory and starts swapping. Today, for example, I was updating CURRENT
(buildworld) and also updating ports, with nothing else running except
firefox, and the box is using 1% of its swap space.


Are you using ZFS, and more to the point, did you recently start using it?

Do you mean start swapping out sooner than it used to do?

Do you expect that swap remains at 0 unless there is serious memory
pressure?

One point: Linux rolls dice when it needs memory, with a tunable that
states the chance that either a cached page gets evicted, or an in-use
page gets swapped out.

Has FreeBSD similar mechanisms these days?


Also, how recent is your CURRENT? There were some VM changes which apparently
helped with this, specifically r260567 and r265944.

   Regards
   Steve


Re: Fatal double fault in ZFS with yesterday's CURRENT

2014-05-04 Thread Steven Hartland


- Original Message - 
From: Fabian Keil freebsd-lis...@fabiankeil.de


Thanks for your help testing this, Fabian. I've now committed the fix for
this:
http://svnweb.freebsd.org/changeset/base/265321

   Regards
   Steve


Re: Fatal double fault in ZFS with yesterday's CURRENT

2014-05-03 Thread Steven Hartland
- Original Message - 
From: Fabian Keil freebsd-lis...@fabiankeil.de



After updating my laptop to yesterday's CURRENT (r265216),
I got the following fatal double fault on boot:
http://www.fabiankeil.de/bilder/freebsd/kernel-panic-r265216/

My previous kernel was based on r264721.

I'm using a couple of custom patches, some of them are ZFS-related
and thus may be part of the problem (but worked fine for months).
I'll try to reproduce the panic without the patches tomorrow.



You're seeing a stack overflow in the new ZFS queuing code, which I
believe is being triggered by lack of support for TRIM in one of
your devices, something Xin reported to me yesterday.

I committed a fix for failing TRIM requests processing slowly last
night, so you could try updating to after r265253 and see if that
helps.

I still need to investigate the stack overflow more directly; it
appears to be caused by the new ZFS queuing code when things are
running slowly and there's a large backlog of I/Os.

In the meantime I would be interested to know your config there,
i.e. zpool layout and hardware.

   Regards
   Steve


Re: Fatal double fault in ZFS with yesterday's CURRENT

2014-05-03 Thread Steven Hartland

Steven Hartland kill...@multiplay.co.uk wrote:

 From: Fabian Keil freebsd-lis...@fabiankeil.de
 
  After updating my laptop to yesterday's CURRENT (r265216),

  I got the following fatal double fault on boot:
  http://www.fabiankeil.de/bilder/freebsd/kernel-panic-r265216/
  
  My previous kernel was based on r264721.

 
  I'm using a couple of custom patches, some of them are ZFS-related
  and thus may be part of the problem (but worked fine for months).
  I'll try to reproduce the panic without the patches tomorrow.
 
 
 You're seeing a stack overflow in the new ZFS queuing code, which I
 believe is being triggered by lack of support for TRIM in one of
 your devices, something Xin reported to me yesterday.
 
 I committed a fix for failing TRIM requests processing slowly last
 night, so you could try updating to after r265253 and see if that
 helps.

Thanks. The hard disk is indeed unlikely to support TRIM requests,
but I can still reproduce the problem with a kernel based on r265255.


Thanks for testing. I suspect it's still a numbers game with how many items
are outstanding in the queue, and now that free / TRIM requests are also
queued it's triggering the failure.

If you're just on an HDD, try setting the following in /boot/loader.conf as
a temporary workaround:
vfs.zfs.trim.enabled=0


 I still need to investigate the stack overflow more directly which
 appears to be caused by the new zfs queuing code when things are
 running slowly and there's a large backlog of IO's.

 In the meantime I would be interested to know your config there,
 i.e. zpool layout and hardware.

The system is a Lenovo ThinkPad R500:
http://www.nycbug.org/index.cgi?action=dmesgd&do=view&dmesgid=2449

I'm booting from UFS, the panic occurs while the pool is being imported.

The pool is located on a single geli-encrypted slice:

fk@r500 ~ $zpool status tank
  pool: tank
 state: ONLINE
  scan: scrub repaired 0 in 4h11m with 0 errors on Sat Mar 22 18:25:01 2014
config:

 NAME           STATE     READ WRITE CKSUM
 tank           ONLINE       0     0     0
   ada0s1d.eli  ONLINE       0     0     0

errors: No known data errors

Maybe geli fails TRIM requests differently.


That helps. Xin also reported the issue with geli, and that's what I'm testing
with. I believe this is a factor because it significantly slows things down,
again meaning more items in the queues, but I've only managed to trigger it
once here as the machine I'm using is pretty quick.

I'll continue looking at this ASAP.

   Regards
   Steve


Re: [head tinderbox] failure on ia64/ia64

2014-04-24 Thread Steven Hartland

Fixed by r264883; sorry for the breakage.

- Original Message - 
From: FreeBSD Tinderbox tinder...@freebsd.org

To: FreeBSD Tinderbox tinder...@freebsd.org; curr...@freebsd.org; 
i...@freebsd.org
Sent: Thursday, April 24, 2014 3:48 PM
Subject: [head tinderbox] failure on ia64/ia64



TB --- 2014-04-24 13:09:23 - tinderbox 2.21 running on freebsd-current.sentex.ca
TB --- 2014-04-24 13:09:23 - FreeBSD freebsd-current.sentex.ca 9.2-STABLE FreeBSD 9.2-STABLE #0 r263721: Tue Mar 25 09:27:39 EDT 
2014 d...@freebsd-current.sentex.ca:/usr/obj/usr/src/sys/GENERIC  amd64

TB --- 2014-04-24 13:09:23 - starting HEAD tinderbox run for ia64/ia64
TB --- 2014-04-24 13:09:23 - cleaning the object tree
TB --- 2014-04-24 13:09:23 - /usr/local/bin/svn stat --no-ignore /src
TB --- 2014-04-24 13:09:28 - At svn revision 264867
TB --- 2014-04-24 13:09:29 - building world
TB --- 2014-04-24 13:09:29 - CROSS_BUILD_TESTING=YES
TB --- 2014-04-24 13:09:29 - MAKEOBJDIRPREFIX=/obj
TB --- 2014-04-24 13:09:29 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2014-04-24 13:09:29 - SRCCONF=/dev/null
TB --- 2014-04-24 13:09:29 - TARGET=ia64
TB --- 2014-04-24 13:09:29 - TARGET_ARCH=ia64
TB --- 2014-04-24 13:09:29 - TZ=UTC
TB --- 2014-04-24 13:09:29 - __MAKE_CONF=/dev/null
TB --- 2014-04-24 13:09:29 - cd /src
TB --- 2014-04-24 13:09:29 - /usr/bin/make -B buildworld

Building an up-to-date make(1)
World build started on Thu Apr 24 13:09:36 UTC 2014
Rebuilding the temporary build tree
stage 1.1: legacy release compatibility shims
stage 1.2: bootstrap tools
stage 2.1: cleaning up the object tree
stage 2.2: rebuilding the object tree
stage 2.3: build tools
stage 3: cross tools
stage 4.1: building includes
stage 4.2: building libraries
stage 4.3: make dependencies
stage 4.4: building everything
World build completed on Thu Apr 24 14:44:36 UTC 2014

TB --- 2014-04-24 14:44:36 - generating LINT kernel config
TB --- 2014-04-24 14:44:36 - cd /src/sys/ia64/conf
TB --- 2014-04-24 14:44:36 - /usr/bin/make -B LINT
TB --- 2014-04-24 14:44:36 - cd /src/sys/ia64/conf
TB --- 2014-04-24 14:44:36 - /obj/ia64.ia64/src/tmp/legacy/usr/sbin/config -m 
LINT
TB --- 2014-04-24 14:44:37 - building LINT kernel
TB --- 2014-04-24 14:44:37 - CROSS_BUILD_TESTING=YES
TB --- 2014-04-24 14:44:37 - MAKEOBJDIRPREFIX=/obj
TB --- 2014-04-24 14:44:37 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2014-04-24 14:44:37 - SRCCONF=/dev/null
TB --- 2014-04-24 14:44:37 - TARGET=ia64
TB --- 2014-04-24 14:44:37 - TARGET_ARCH=ia64
TB --- 2014-04-24 14:44:37 - TZ=UTC
TB --- 2014-04-24 14:44:37 - __MAKE_CONF=/dev/null
TB --- 2014-04-24 14:44:37 - cd /src
TB --- 2014-04-24 14:44:37 - /usr/bin/make -B buildkernel KERNCONF=LINT

Kernel build for LINT started on Thu Apr 24 14:44:37 UTC 2014
stage 1: configuring the kernel
stage 2.1: cleaning up the object tree
stage 2.2: rebuilding the object tree
stage 2.3: build tools
stage 3.1: making dependencies
stage 3.2: building everything

[...]

c  -c -O2 -pipe -fno-strict-aliasing  -std=c99  -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes 
 -Wpointer-arith -Winline -Wcast-qual  -Wundef -Wno-pointer-sign -fformat-extensions  -Wmissing-include-dirs -fdiagnostics-show-option 
   -nostdinc  -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include 
opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param 
large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror 
/src/sys/ddb/db_main.c


c  -c -O2 -pipe -fno-strict-aliasing  -std=c99  -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes 
 -Wpointer-arith -Winline -Wcast-qual  -Wundef -Wno-pointer-sign -fformat-extensions  -Wmissing-include-dirs -fdiagnostics-show-option 
   -nostdinc  -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include 
opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param 
large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror 
/src/sys/ddb/db_output.c


c  -c -O2 -pipe -fno-strict-aliasing  -std=c99  -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes 
 -Wpointer-arith -Winline -Wcast-qual  -Wundef -Wno-pointer-sign -fformat-extensions  -Wmissing-include-dirs -fdiagnostics-show-option 
   -nostdinc  -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include 
opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param 
large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror 
/src/sys/ddb/db_print.c


c  -c -O2 -pipe -fno-strict-aliasing  -std=c99  -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes 
 

Re: ZFS secondarycache on SSD problem on r255173

2014-03-03 Thread Steven Hartland
- Original Message - 
From: Andriy Gapon a...@freebsd.org




on 18/10/2013 17:57 Steven Hartland said the following:

I think we may well need the following patch to set the minblock
size based on the vdev ashift and not SPA_MINBLOCKSIZE.

svn diff -x -p sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
===
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c (revision 256554)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c (working copy)
@@ -5147,7 +5147,7 @@ l2arc_compress_buf(l2arc_buf_hdr_t *l2hdr)
   len = l2hdr->b_asize;
   cdata = zio_data_buf_alloc(len);
   csize = zio_compress_data(ZIO_COMPRESS_LZ4, l2hdr->b_tmp_cdata,
-   cdata, l2hdr->b_asize, (size_t)SPA_MINBLOCKSIZE);
+   cdata, l2hdr->b_asize, (size_t)(1ULL << l2hdr->b_dev->l2ad_vdev->vdev_ashift));

   if (csize == 0) {
   /* zero block, indicate that there's nothing to write */



This is a rather old thread and change, but I think that I have identified
another problem with 4KB cache devices.

I noticed that on some of our systems we were getting a clearly abnormal number
of l2arc checksum errors accounted in l2_cksum_bad.  The hardware appeared to be
in good health.  Using DTrace I noticed that the data seemed to be overwritten
with other data.  After more DTrace analysis I observed that sometimes
l2arc_write_buffers() would advance l2ad_hand by more than target_sz.
This meant that l2arc_write_buffers() would write beyond a region cleared by
l2arc_evict() and thus overwrite data belonging to non-evicted buffers.  Havoc
ensues.

The cache devices in question are all SSDs with logical sector size of 4KB.
I am not sure about other ZFS platforms, but on FreeBSD this fact is detected
and ashift of 12 is used for the cache vdevs.
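
To make the sector-size/ashift relationship concrete, here is a minimal C sketch. The function names sector_size_to_ashift and ashift_to_min_block are hypothetical, chosen for illustration only; they are not ZFS functions. Only the 1ULL << ashift expression is taken from the patch quoted earlier in this message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: smallest ashift such that (1 << ashift) is at
 * least sector_size.  For power-of-two sector sizes this is log2().
 */
unsigned
sector_size_to_ashift(uint64_t sector_size)
{
	unsigned ashift = 0;

	assert(sector_size != 0 && sector_size <= (UINT64_C(1) << 63));
	while ((UINT64_C(1) << ashift) < sector_size)
		ashift++;
	return (ashift);
}

/*
 * Hypothetical helper: smallest allocation unit for a given ashift,
 * i.e. the value the quoted patch passes as the minimum block size.
 */
uint64_t
ashift_to_min_block(unsigned ashift)
{
	assert(ashift < 64);
	return (UINT64_C(1) << ashift);
}
```

With these definitions a 4096-byte sector maps to ashift 12 and a 512-byte sector to ashift 9, and ashift 12 maps back to a 4096-byte minimum block.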

Looking at l2arc_write_buffers() code you can see that it properly accounts for
ashift when actually writing buffers and advancing l2ad_hand:
   /*
    * Keep the clock hand suitably device-aligned.
    */
   buf_p_sz = vdev_psize_to_asize(dev->l2ad_vdev, buf_sz);
   write_psize += buf_p_sz;
   dev->l2ad_hand += buf_p_sz;

But the same is not done when selecting buffers to be written and checking that
target_sz is not exceeded.
So, if the ARC contains a lot of buffers smaller than 4K, the aligned
on-disk size of the L2ARC buffers can be considerably larger than their
non-aligned size.
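
The mismatch described above can be modelled with a small C sketch. This is a simplified model, not the code from arc.c and not the linked patch: round_up_to_ashift and hand_advance are hypothetical names, and the selection loop only imitates the idea of charging a budget (target_sz) while the hand advances by aligned sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round size up to a multiple of (1 << ashift), as an aligned write would. */
uint64_t
round_up_to_ashift(uint64_t size, unsigned ashift)
{
	uint64_t align;

	assert(ashift < 64);
	align = UINT64_C(1) << ashift;
	return ((size + align - 1) & ~(align - 1));
}

/*
 * Select buffers until the next one would push the charged total past
 * target_sz, and return how far the write hand would advance.
 * If count_aligned is zero, the budget is charged with raw buffer sizes
 * (the behaviour described in the message); otherwise it is charged
 * with the aligned sizes that are actually written.
 */
uint64_t
hand_advance(const uint64_t *buf_sz, size_t nbufs, uint64_t target_sz,
    unsigned ashift, int count_aligned)
{
	uint64_t budget = 0, advance = 0;
	size_t i;

	for (i = 0; i < nbufs; i++) {
		uint64_t asize = round_up_to_ashift(buf_sz[i], ashift);
		uint64_t charge = count_aligned ? asize : buf_sz[i];

		if (budget + charge > target_sz)
			break;
		budget += charge;
		advance += asize;	/* the hand always moves by aligned size */
	}
	return (advance);
}
```

In this model, four 512-byte buffers with ashift 12 and a 4096-byte target all pass the raw-size check, yet the hand moves 16384 bytes, four times the target. Charging aligned sizes selects only one buffer and keeps the advance within the target.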

I propose the following patch which has been tested and seems to fix the problem
without introducing any side effects:
https://github.com/avg-I/freebsd/compare/review;l2arc-write-target-size.diff
https://github.com/avg-I/freebsd/compare/review;l2arc-write-target-size


Looks good to me.

   Regards
   Steve


This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. 


In the event of misdirection, illegible or incomplete transmission please 
telephone +44 845 868 1337
or return the E.mail to postmas...@multiplay.co.uk.



Re: ZFS command can block the whole ZFS subsystem!

2014-01-04 Thread Steven Hartland

On Fri, 3 Jan 2014 17:04:00 -
Steven Hartland kill...@multiplay.co.uk wrote:

..

 Sorry, I'm confused then, as you said "locks up the entire command and
 even worse - it seems to wind up the pool in question for being
 exported!"

 Which to me read like you were saying the pool ended up being
 exported.

I'm not a native English speaker. My intention was, to make it short:

remove the dummy file. Having issued the command in the foreground of
the terminal, I decided a second after hitting return to send it to the
background by suspending the rm command and then issuing bg.


Ahh thanks for explaining :)


I expect to get the command into the background as every other
UNIX command does when sending Ctrl-Z in the console.
Obviously, ZFS related stuff in FreeBSD doesn't comply.
   
The file has been removed from the pool but the console is still
stuck with ^Z fg (as I typed this in). Process list tells me:
   
top
17790 root          1  20    0  8228K  1788K STOP   10   0:05   0.00% rm
   
for the particular rm command issued.
  
   That's not backgrounded yet, otherwise it wouldn't be in the state
   STOP.
 
  As I said - the job was never backgrounded, locked up the terminal and
  made the whole pool unresponsive.

 Have you tried sending a continue signal to the process?

No, not intentionally. Since the operation started to slow down the
whole box and seemed to affect nearly every operation I attempted on
ZFS pools (zpool status, zpool import of the faulty pool, zpool export),
I rebooted the machine.

After the reboot, when ZFS came up, the drive started working like
crazy again and the system stalled while recognizing the ZFS pools.
I then did a hard reset and restarted in single user mode, exported the
pool successfully, and rebooted. But the moment I did a zpool import
POOL, the heavy disk activity continued.



Now, having the file deleted, I'd like to export the pool for
further maintenance
  
   Are you sure the delete is complete? Also don't forget ZFS has
   TRIM by default, so depending on support of the underlying
   devices you could be seeing deletes occurring.
 
  Quite sure it didn't! It has been taking hours (~ 8 now) and the drive
  is still working, although I tried to stop it.

 A delete of a file shouldn't take 8 hours, but you don't say how large
 the file actually is?

The drive has a capacity of ~ 2,7 TiB (Western Digital 3TB drive). The
file I created was, do not laugh, please, 2,7 TB :-( I guess that, given
the COW technique and what I have read about ZFS in this thread and
others, this is the culprit. There is no space left to delete the file
safely.

By the way - the box is still working that drive at 100% :-( That's
12 hours now.



   You can check that with gstat -d
 
  The command reports 100% activity on the drive. I exported the pool in
  question in single user mode and am now trying to import it back while
  in multiuser mode.

 Sorry you seem to be stating conflicting things:
 1. The delete hasn't finished
 2. The pool export hung
 3. You have exported the pool


Not conflicting, but my non-expert terminology is not as accurate
and precise as you may expect.

ad item 1) I terminated (by the brute force of the mighty RESET button)
the copy command. It hasn't finished the operation on the pool as far as
I can see, but it might be a kind of recovery mechanism in progress now,
not the rm command anymore.

ad 2) Yes, first it hung, then I reset the box, then I exported the pool
in single user mode to avoid further interaction, then I tried to import
the pool again ...
ad 3) Yes, successfully after the reset. Now I have imported the pool and
the terminal in which I issued the command is stuck again while
the pool is under heavy load.


 What exactly is gstat -d reporting, can you paste the output please.

I think it is boring to look at 100% activity, but here it is ;-)


dT: 1.047s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada0
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada1
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada2
   10    114    114    455   85.3      0      0    0.0      0      0    0.0  100.0| ada3
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada4
...
   10    114    114    455   85.3      0      0    0.0      0      0    0.0  100.0| ada3p1
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada4p1


  Shortly after issuing the command
 
  zpool import POOL00
 
  the terminal is stuck again, the drive is working at 100% for two
  hours now and it seems the great ZFS is deleting every block per
  pedes. Is this supposed to last days or a week?

 What controller and what drive?

Hardware is as follows:
CPU: Intel(R) Core(TM) i7-3930K CPU

Re: ZFS command can block the whole ZFS subsystem!

2014-01-03 Thread Steven Hartland


- Original Message - 
From: O. Hartmann ohart...@zedat.fu-berlin.de


For security reasons, I dumped via dd a large file onto a 3TB
disk. The system is 11.0-CURRENT #1 r259667: Fri Dec 20 22:43:56 CET
2013 amd64. Filesystem in question is a single ZFS pool.

Issuing the command

rm dumpfile.txt

and then hitting Ctrl-Z to bring the rm command into background via
fg (I use FreeBSD's csh in that console) locks up the entire command
and even worse - it seems to wind up the pool in question for being
exported!


I can't think of any reason why backgrounding a shell would export a pool.


I expect to get the command into the background as every other UNIX
command does when sending Ctrl-Z in the console. Obviously, ZFS
related stuff in FreeBSD doesn't comply. 


The file has been removed from the pool but the console is still stuck
with ^Z fg (as I typed this in). Process list tells me:

top
17790 root          1  20    0  8228K  1788K STOP   10   0:05   0.00% rm

for the particular rm command issued.


That's not backgrounded yet, otherwise it wouldn't be in the state STOP.


Now, having the file deleted, I'd like to export the pool for further
maintenance


Are you sure the delete is complete? Also don't forget ZFS has TRIM by
default, so depending on support of the underlying devices you could
be seeing deletes occurring.

You can check that with gstat -d


but that doesn't work with

zpool export -f poolname

This command is now also stuck blocking the terminal and the pool from
further actions.


If the delete hasn't completed and is stuck in the kernel this is
to be expected.


This is painful. Last time I faced the problem, I had to reboot prior
to take any action regarding any pool in the system, since one single
ZFS command could obviously block the whole subsystem (I tried to
export and import).

What is up here?


   Regards
   Steve




