[OmniOS-discuss] Badlock SMB bug -- do we know if Illumos kernel CIFS implementation is also affected?

2016-03-23 Thread Dave Pooser
See http://badlock.org/ for more information. I'd like to assume that the
devs in question would have communicated with other SMB implementations
like the Solaris/Illumos and Apple's implementation, but
-- 
Dave Pooser
Cat-Herder-in-Chief, Pooserville.com


___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] 4kn or 512e with ashift=12

2016-03-23 Thread Bob Friesenhahn

On Wed, 23 Mar 2016, Richard Elling wrote:

>> On Mar 23, 2016, at 7:49 AM, Richard Jahnel wrote:
>>
>> It should be noted that using a 512e disk as a 512n disk subjects you to a
>> significant risk of silent corruption in the event of power loss. Because
>> 512e disks do a read>modify>write operation to modify a 512-byte chunk of a
>> 4k sector, zfs won't know about the other 7 corrupted 512e sectors in the
>> event of a power loss during a write operation. So when it discards the
>> incomplete txg on reboot, it won't do anything about the other 7 512e
>> sectors it doesn't know were affected.
>
> Disagree. The risk is no greater than HDDs today with their volatile write
> caches.


If the data unrelated to the current transaction group is read and 
then partially modified (possibly with data corruption due to loss of 
power during write), this would seem to be worse than loss due to a 
volatile write cache (assuming drives which observe cache sync 
requests) since data unrelated to the current transaction group may 
have been modified.  The end result would be checksum errors during a 
scrub.
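
To illustrate why damage to blocks outside the current transaction group
still surfaces at scrub time, here is a minimal Python sketch. It assumes a
toy pool of eight 512-byte blocks and uses SHA-256 as a stand-in checksum;
the names checksum and scrub and the block layout are hypothetical and are
not ZFS code.

```python
import hashlib

BLOCK = 512  # toy block size, matching a 512-byte logical sector


def checksum(block):
    """Stand-in checksum; assumption: any strong hash illustrates the idea."""
    return hashlib.sha256(block).digest()


def scrub(blocks, expected):
    """Return indexes of blocks whose contents no longer match their checksum."""
    return [i for i, blk in enumerate(blocks) if checksum(blk) != expected[i]]


# Eight blocks written by earlier, fully committed transaction groups.
blocks = [bytes([i + 1]) * BLOCK for i in range(8)]
expected = [checksum(blk) for blk in blocks]

# A torn read-modify-write aimed at another block also damages block 5,
# which belongs to an older transaction group.
blocks[5] = b"\x00" * BLOCK

bad = scrub(blocks, expected)
print(bad)
```

Nothing flags block 5 when the incomplete transaction group is discarded;
only a later pass that re-reads every block and compares checksums finds it.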


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/


[OmniOS-discuss] Good way to debug DTrace invalid address errors?

2016-03-23 Thread Chris Siebenmann
 I have a relatively complicated chunk of dtrace code that reads kernel
data structures and chases pointers through them. Some of the time it
spits out 'invalid address' errors during execution, for example:

dtrace: error on enabled probe ID 8 (ID 75313: fbt:nfssrv:nfs3_fhtovp:return): invalid address (0x2e8) in action #6 at DIF offset 40
dtrace: error on enabled probe ID 8 (ID 75313: fbt:nfssrv:nfs3_fhtovp:return): invalid address (0xbaddcafebaddcc7e) in action #6 at DIF offset 60

I'd like to find out exactly what pointer dereferences or other
operations are failing here, so that I can figure out how to work
around the issue. However, I have no solid idea how to map things
like 'probe ID 8' and 'DIF offset 60' to particular lines in my
DTrace source code.
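
One hedged observation on the addresses themselves: both look like a bad
base pointer plus a structure member offset. The sketch below, in Python,
splits a faulting address that way. It assumes the illumos kmem debug fill
patterns (0xbaddcafebaddcafe for uninitialized memory, 0xdeadbeefdeadbeef
for freed memory) and an arbitrary 64 KiB cap on member offsets; the
function name explain_fault is hypothetical, and none of this maps a DIF
offset back to a source line.

```python
# illumos kmem debug fill patterns (64-bit); seeing them presumes that
# kmem debugging fills are in effect on the system.
KMEM_PATTERNS = {
    0xBADDCAFEBADDCAFE: "uninitialized kmem buffer",
    0xDEADBEEFDEADBEEF: "freed kmem buffer",
}

# Assumption: structure members sit within 64 KiB of the base pointer.
MAX_FIELD_OFFSET = 0x10000


def explain_fault(addr):
    """Guess which bad base pointer plus member offset produced addr."""
    if addr < MAX_FIELD_OFFSET:
        return ("NULL pointer", addr)
    for pattern, meaning in KMEM_PATTERNS.items():
        offset = addr - pattern
        if 0 <= offset < MAX_FIELD_OFFSET:
            return (meaning, offset)
    return ("unrecognized base", None)


for addr in (0x2E8, 0xBADDCAFEBADDCC7E):
    base, offset = explain_fault(addr)
    print(hex(addr), base, offset)
```

Read this way, the first error would be a NULL pointer dereferenced at
member offset 0x2e8 and the second an uninitialized pointer dereferenced at
member offset 0x180, which narrows down which pointer chase to guard.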

 I assume that the answer to this involves reading DIF (the DTrace
intermediate form). I've looked at 'dtrace -Se' output from this
DTrace script, but I can't identify the spot I need to look at.
In particular, as far as I can see nothing in the output has
instructions with an offset as high as 40 or 60.

 I can flail around sticking guards in and varying how I do stuff
to make the errors go away, but I'd like to understand how to debug
this sort of stuff so I can have more confidence in my changes.

 Thanks in advance for any suggestions, and if people want to see
the actual code involved it is this DTrace script:

https://github.com/siebenmann/cks-dtrace/blob/master/nfs3-long.d

(Look at line 144 for the specific dtrace probe that is probably
failing, since it's the only probe on fbt:nfssrv:nfs3_fhtovp:return.)

- cks
PS: it's entirely possible that there's a better way to do what I'm
trying here, too. These DTrace scripts were originally written
on Solaris 10 update 8 and haven't been drastically revised
since.


Re: [OmniOS-discuss] 4kn or 512e with ashift=12

2016-03-23 Thread Richard Elling

> On Mar 23, 2016, at 7:36 AM, Chris Siebenmann wrote:
> 
>>> The sd.conf whitelist also requires a reboot to activate if you need
>>> to add a new entry, as far as I know.
>>> 
>>> (Nor do I know what happens if you have some 512n disks and
>>> some 512e disks, both correctly recognized and in different
>>> pools, and now you need to replace a 512n disk with a spare 512e
>>> disk so you change sd.conf to claim that all of the 512e disks
>>> are 512n. I'd like to think that ZFS will carry on as normal,
>>> but I'm not sure.  This makes it somewhat dangerous to change
>>> sd.conf on a live system.)
>> 
>> There are two cases if we don't use the remedy (whitelist in illumos
>> or -o ashift in ZoL) here:
>> a): 512n <---> 512e. This replacement should work *in theory* if the
>> lie works *correctly*.
> 
> This will not work without the sd.conf workaround in Illumos.
> 
> All 512e disks that I know of correctly report their actual physical
> sector size to Illumos (and to Linux/ZoL). When a disk reports a 4K
> physical sector size, ZFS will refuse to allow it into an ashift=9
> vdev *regardless* of the fact that it is 512e and will accept reads
> and writes in 512-byte sectors.
> 
> In Illumos, you can use sd.conf to lie to the system and claim that
> this is not a 512e but a 512n disk (ie, it has a 512 byte physical
> sector size). I don't believe there's an equivalent on ZoL, but I
> haven't looked.
> 
> This absolute insistence on ZFS's part is what makes ashift=9 vdevs so
> dangerous today, because you cannot replace existing 512n disks in them
> with 512e disks without (significant) hackery.
> 
> (Perhaps I'm misunderstanding what people mean by '512e' here; I've
> been assuming it means a disk which reports 512 byte logical sectors and
> 4k physical sectors. Such disks are what you commonly get today.)

Yes. 512e means:
 un_phy_blocksize = 4096 (or 8192)
 un_tgt_blocksize = 512
for disks that don't lie. Lying disks claim un_phy_blocksize = 512 when it 
isn't.
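
As a small illustration of that definition, the following Python sketch
classifies a disk from the two values named above. The function name
classify_disk and the returned labels are hypothetical; only the field names
un_tgt_blocksize and un_phy_blocksize come from the sd driver.

```python
def classify_disk(un_tgt_blocksize, un_phy_blocksize):
    """Label a disk by its reported logical and physical block sizes."""
    if un_tgt_blocksize == 512 and un_phy_blocksize == 512:
        # Also what a lying 512e disk looks like from the host's side.
        return "512n"
    if un_tgt_blocksize == 512 and un_phy_blocksize > 512:
        return "512e"
    if un_tgt_blocksize == 4096 and un_phy_blocksize == 4096:
        return "4kn"
    return "other"


for logical, physical in ((512, 512), (512, 4096), (512, 8192), (4096, 4096)):
    print(logical, physical, classify_disk(logical, physical))
```

A lying disk is indistinguishable from a true 512n disk in this scheme,
which is exactly why it needs an sd.conf entry to correct it.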

At this point, before the discussion degenerates further, remember that George
covered this in detail at the OpenZFS conference and in his blog.
http://blog.delphix.com/gwilson/2012/11/15/4k-sectors-and-zfs/
http://www.youtube.com/watch?v=TmH3iRLhZ-A&feature=youtu.be


 -- richard



Re: [OmniOS-discuss] 4kn or 512e with ashift=12

2016-03-23 Thread Chris Siebenmann
> It should be noted that using a 512e disk as a 512n disk subjects
> you to a significant risk of silent corruption in the event of power
> loss. Because 512e disks do a read>modify>write operation to
> modify a 512-byte chunk of a 4k sector, zfs won't know about the other
> 7 corrupted 512e sectors in the event of a power loss during a write
> operation. So when it discards the incomplete txg on reboot, it won't do
> anything about the other 7 512e sectors it doesn't know were affected.

 This is true; under normal circumstances you do not want to use a
512e drive in an ashift=9 vdev. However, if you have a dead 512n drive
and you have no remaining 512n spares, your choices are to run without
redundancy, to wedge in a 512e drive and accept the potential problems
on power failure (problems that can likely be fixed by scrubbing the
pool afterwards), or obtain enough additional drives (and perhaps
server(s)) to entirely rebuild the pool on 512e drives with ashift=12.

 In this situation, running with a 512e drive and accepting the
performance issues and potential exposure to power failures is basically
the lesser evil. (I wish ZFS was willing to accept this, but it isn't.)

- cks


Re: [OmniOS-discuss] 4kn or 512e with ashift=12

2016-03-23 Thread Richard Elling

> On Mar 23, 2016, at 7:49 AM, Richard Jahnel wrote:
> 
> It should be noted that using a 512e disk as a 512n disk subjects you to a 
> significant risk of silent corruption in the event of power loss. Because 
> 512e disks do a read>modify>write operation to modify a 512-byte chunk of a 
> 4k sector, zfs won't know about the other 7 corrupted 512e sectors in the 
> event of a power loss during a write operation. So when it discards the 
> incomplete txg on reboot, it won't do anything about the other 7 512e 
> sectors it doesn't know were affected.

Disagree. The risk is no greater than HDDs today with their volatile write 
caches.
 -- richard



Re: [OmniOS-discuss] 4kn or 512e with ashift=12

2016-03-23 Thread Richard Jahnel
It should be noted that using a 512e disk as a 512n disk subjects you to a 
significant risk of silent corruption in the event of power loss. Because 512e 
disks do a read>modify>write operation to modify a 512-byte chunk of a 4k 
sector, zfs won't know about the other 7 corrupted 512e sectors in the event of 
a power loss during a write operation. So when it discards the incomplete txg 
on reboot, it won't do anything about the other 7 512e sectors it doesn't know 
were affected.
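
A toy Python model of the read>modify>write path may make the exposure
concrete. It is only a sketch under a stated assumption: that a power loss
during the rewrite leaves the whole 4k physical sector as garbage. Real
drives may behave differently, and the class name Disk512e and the
power_fails flag are hypothetical.

```python
LOGICAL = 512                    # logical sector size presented to the host
PHYSICAL = 4096                  # physical sector size on the media
PER_PHYS = PHYSICAL // LOGICAL   # 8 logical sectors share one physical sector


class Disk512e:
    """Toy 512e disk: every logical write is a physical read-modify-write."""

    def __init__(self, physical_sectors):
        self.media = bytearray(physical_sectors * PHYSICAL)

    def read_logical(self, lba):
        start = lba * LOGICAL
        return bytes(self.media[start:start + LOGICAL])

    def write_logical(self, lba, data, power_fails=False):
        assert len(data) == LOGICAL
        start = (lba // PER_PHYS) * PHYSICAL
        sector = bytearray(self.media[start:start + PHYSICAL])   # read
        offset = (lba % PER_PHYS) * LOGICAL
        sector[offset:offset + LOGICAL] = data                    # modify
        if power_fails:
            # Assumed failure model: the rewrite is torn and the whole
            # physical sector ends up as garbage.
            sector = bytearray(b"\xff" * PHYSICAL)
        self.media[start:start + PHYSICAL] = sector               # write


disk = Disk512e(physical_sectors=1)
old = [bytes([i + 1]) * LOGICAL for i in range(PER_PHYS)]
for lba, data in enumerate(old):
    disk.write_logical(lba, data)

# Power is lost while logical sector 3 is being rewritten.
disk.write_logical(3, b"\xaa" * LOGICAL, power_fails=True)

damaged = [lba for lba in range(PER_PHYS)
           if lba != 3 and disk.read_logical(lba) != old[lba]]
print(damaged)
```

Under that assumption the host only asked to change logical sector 3, yet
the seven logical sectors sharing its physical sector are damaged as well.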

Richard Jahnel
Network Engineer
On-Site.com | Ellipse Design
866-266-7483 ext. 4408
Direct: 669-800-6270


-----Original Message-----
From: OmniOS-discuss [mailto:omnios-discuss-boun...@lists.omniti.com] On Behalf 
Of Chris Siebenmann
Sent: Wednesday, March 23, 2016 9:36 AM
To: Fred Liu 
Cc: omnios-discuss@lists.omniti.com
Subject: Re: [OmniOS-discuss] 4kn or 512e with ashift=12

> > The sd.conf whitelist also requires a reboot to activate if you need
> > to add a new entry, as far as I know.
> > 
> > (Nor do I know what happens if you have some 512n disks and
> > some 512e disks, both correctly recognized and in different
> > pools, and now you need to replace a 512n disk with a spare 512e
> > disk so you change sd.conf to claim that all of the 512e disks
> > are 512n. I'd like to think that ZFS will carry on as normal,
> > but I'm not sure.  This makes it somewhat dangerous to change
> > sd.conf on a live system.)
> 
> There are two cases if we don't use the remedy (whitelist in illumos 
> or -o ashift in ZoL) here:
> a): 512n <---> 512e. This replacement should work *in theory* if the 
> lie works *correctly*.

 This will not work without the sd.conf workaround in Illumos.

 All 512e disks that I know of correctly report their actual physical sector 
size to Illumos (and to Linux/ZoL). When a disk reports a 4K physical sector size, 
ZFS will refuse to allow it into an ashift=9 vdev *regardless* of the fact that 
it is 512e and will accept reads and writes in 512-byte sectors.

 In Illumos, you can use sd.conf to lie to the system and claim that this is 
not a 512e but a 512n disk (ie, it has a 512 byte physical sector size). I 
don't believe there's an equivalent on ZoL, but I haven't looked.

 This absolute insistence on ZFS's part is what makes ashift=9 vdevs so 
dangerous today, because you cannot replace existing 512n disks in them with 
512e disks without (significant) hackery.

(Perhaps I'm misunderstanding what people mean by '512e' here; I've been 
assuming it means a disk which reports 512 byte logical sectors and 4k physical 
sectors. Such disks are what you commonly get today.)

- cks


Re: [OmniOS-discuss] 4kn or 512e with ashift=12

2016-03-23 Thread Chris Siebenmann
> > The sd.conf whitelist also requires a reboot to activate if you need
> > to add a new entry, as far as I know.
> > 
> > (Nor do I know what happens if you have some 512n disks and
> > some 512e disks, both correctly recognized and in different
> > pools, and now you need to replace a 512n disk with a spare 512e
> > disk so you change sd.conf to claim that all of the 512e disks
> > are 512n. I'd like to think that ZFS will carry on as normal,
> > but I'm not sure.  This makes it somewhat dangerous to change
> > sd.conf on a live system.)
> 
> There are two cases if we don't use the remedy (whitelist in illumos
> or -o ashift in ZoL) here:
> a): 512n <---> 512e. This replacement should work *in theory* if the
> lie works *correctly*.

 This will not work without the sd.conf workaround in Illumos.

 All 512e disks that I know of correctly report their actual physical
sector size to Illumos (and to Linux/ZoL). When a disk reports a 4K
physical sector size, ZFS will refuse to allow it into an ashift=9
vdev *regardless* of the fact that it is 512e and will accept reads
and writes in 512-byte sectors.
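
The rule described here can be sketched in a few lines of Python. This is a
simplified model of the behaviour as described in this message, not ZFS
source; the function names and the claimed_physical parameter (standing in
for an sd.conf override) are hypothetical.

```python
def required_ashift(physical_bytes):
    """ashift is log2 of the sector size: 512 -> 9, 4096 -> 12."""
    if physical_bytes <= 0 or physical_bytes & (physical_bytes - 1):
        raise ValueError("sector size must be a power of two")
    return physical_bytes.bit_length() - 1


def can_join_vdev(vdev_ashift, reported_physical, claimed_physical=None):
    """Model of the check: a disk needing a larger ashift is refused.

    claimed_physical stands in for an override that makes the system
    believe a different physical sector size (hypothetical parameter).
    """
    physical = reported_physical if claimed_physical is None else claimed_physical
    return required_ashift(physical) <= vdev_ashift


print(can_join_vdev(9, 4096))                        # honest 512e disk, ashift=9 vdev
print(can_join_vdev(9, 4096, claimed_physical=512))  # same disk with the override
print(can_join_vdev(12, 512))                        # 512n disk, ashift=12 vdev
```

The asymmetry is the point: an ashift=12 vdev accepts either kind of disk in
this model, while an ashift=9 vdev accepts a 512e disk only when the system
has been told to treat it as 512n.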

 In Illumos, you can use sd.conf to lie to the system and claim that
this is not a 512e but a 512n disk (ie, it has a 512 byte physical
sector size). I don't believe there's an equivalent on ZoL, but I
haven't looked.

 This absolute insistence on ZFS's part is what makes ashift=9 vdevs so
dangerous today, because you cannot replace existing 512n disks in them
with 512e disks without (significant) hackery.

(Perhaps I'm misunderstanding what people mean by '512e' here; I've
been assuming it means a disk which reports 512 byte logical sectors and
4k physical sectors. Such disks are what you commonly get today.)

- cks


Re: [OmniOS-discuss] 4kn or 512e with ashift=12

2016-03-23 Thread Fred Liu


> -----Original Message-----
> From: Richard Elling [mailto:richard.ell...@richardelling.com]
> Sent: Wednesday, March 23, 2016 4:53
> To: Chris Siebenmann
> Cc: Fred Liu; omnios-discuss@lists.omniti.com
> Subject: Re: [OmniOS-discuss] 4kn or 512e with ashift=12
> 
> 
> On Mar 22, 2016, at 7:41 AM, Chris Siebenmann wrote:
>
>>>> This implicitly assumes that the only reason to set ashift=12 is
>>>> if you are currently using one or more drives that require it. I
>>>> strongly disagree with this view. Since ZFS cannot currently replace
>>>> a 512n drive with a 512e one, I feel [...]
>>>
>>> *In theory* this replacement should work well if the lie works
>>> *correctly*.
>>> In ZoL, since "-o ashift" is supported in "zpool replace", the
>>> replacement should also work with mixed sector sizes.
>>> And in illumos the whitelist will do the same.
>>> What errors have you ever seen?
>>
>> We have seen devices that changed between (claimed) 512n and
>> (claimed) 512e/4k *within the same model number*; the only thing that
>> distinguished the two was firmware version (which is not something that
>> you can match in sd.conf). This came as a complete surprise to us the
>> first time we needed to replace an old (512n) one of these with a new
>> (512e) one.
>>
>> The sd.conf whitelist also requires a reboot to activate if you need
>> to add a new entry, as far as I know.
>>
>> (Nor do I know what happens if you have some 512n disks and some
>> 512e disks, both correctly recognized and in different pools, and
>> now you need to replace a 512n disk with a spare 512e disk so you
>> change sd.conf to claim that all of the 512e disks are 512n. I'd
>> like to think that ZFS will carry on as normal, but I'm not sure.
>> This makes it somewhat dangerous to change sd.conf on a live system.)

There are two cases if we don't use the remedy (whitelist in illumos or -o
ashift in ZoL) here:
a): 512n <---> 512e. This replacement should work *in theory* if the lie works
*correctly*.
b): 512n <-x-> 4kn.  This replacement may not work because of the different
physical sector sizes.

Your surprise may come from case b.
>
> What is missing from
> http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks
> is:
>
> 1. how to change the un_phy_blocksize for any or all uns
> 2. how to set a default setting for all drives in sd.conf by setting
>    attributes to the "" of "" (see sd(7d))
>
> I am aware of no new HDDs with 512n, so this problem will go away for HDDs.
> However, there are many SSDs that work better with un_phy_blocksize = 8192
> and some vendors set sd.conf or source appropriately.
>  -- richard
>
>>>> For many usage cases, somewhat more space usage and perhaps
>>>> somewhat slower pools are vastly preferable to a loss of pool
>>>> redundancy over time. I feel that OmniOS should at least give you
>>>> the option here (in a less crude way than simply telling it that
>>>> absolutely all of your drives are 4k drives, partly because such
>>>> general lies are problematic in various situations).
>>>
>>> The whitelist (sd.conf) should fit into this consideration. But not
>>> sure how mixed sector sizes impact the performance.
>>
>> Oh, 512e disks in a 512n pool will probably have not great performance.
>> ZFS does a lot of unaligned reads and writes, unlike other filesystems;
>> if you say your disks are 512n, it really believes you and behaves
>> accordingly.

I am just curious whether mixed sector sizes (512n+4kn) will impact
performance.

Thanks.

Fred