Re: about zfs and ashift and changing ashift on existing zpool

2019-04-08 Thread Michael Butler
On 2019-04-08 20:55, Alexander Motin wrote:
> On 08.04.2019 20:21, Eugene Grosbein wrote:
>> 09.04.2019 7:00, Kevin P. Neal wrote:
>>
>>>> My guess (given that only ada1 is reporting a blocksize mismatch) is that
>>>> your disks reported a 512B native blocksize.  In the absence of any override,
>>>> ZFS will then build an ashift=9 pool.
>>
>> [skip]
>>
>>> smartctl 7.0 2018-12-30 r4883 [FreeBSD 11.2-RELEASE-p4 amd64] (local build)
>>> Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
>>>
>>> === START OF INFORMATION SECTION ===
>>> Vendor:   SEAGATE
>>> Product:  ST2400MM0129
>>> Revision: C003
>>> Compliance:   SPC-4
>>> User Capacity:        2,400,476,553,216 bytes [2.40 TB]
>>> Logical block size:   512 bytes
>>> Physical block size:  4096 bytes
>>
>> Maybe it's time to prefer "Physical block size" over "Logical block size"
>> in relevant GEOMs like GEOM_DISK, so that upper levels such as ZFS would
>> do the right thing automatically.
> 
> No.  It is a bad idea.  Changing the logical block size of existing disks
> would most likely break compatibility and make previously written data
> unreadable.  ZFS already uses the physical block size when possible -- on
> pool creation or new vdev addition.  When that is not possible (the pool
> was already created with too small an ashift), it just complains, so that
> the user knows the configuration is imperfect and should not expect full
> performance.

And some drives just report 512 bytes for both the logical and physical
size - no idea if this is consistent with the underlying silicon :-(  I
built a ZFS pool on it using 4K blocks anyway.
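
For the record, roughly how I did it - pool and device names below are
only examples, not my actual layout:

# make ZFS use at least 4K (2^12) allocation for any newly created vdev
sysctl vfs.zfs.min_auto_ashift=12
# the pool then gets ashift=12 even though the drive reports 512-byte
# logical and physical sectors
zpool create tank ada0
# sanity check
zdb -C tank | grep ashift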

smartctl 7.0 2018-12-30 r4883 [FreeBSD 13.0-CURRENT amd64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     WDC WDS100T2B0A-00SM50
Serial Number:    1837B0803409
LU WWN Device Id: 5 001b44 8b99f7560
Firmware Version: X61190WD
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-4 T13/BSR INCITS 529 revision 5
SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Apr  8 21:22:15 2019 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM level is:     128 (minimum power consumption without standby)
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, frozen [SEC2]
Wt Cache Reorder: Unavailable

imb




Re: about zfs and ashift and changing ashift on existing zpool

2019-04-08 Thread Alexander Motin
On 08.04.2019 20:21, Eugene Grosbein wrote:
> 09.04.2019 7:00, Kevin P. Neal wrote:
> 
>>> My guess (given that only ada1 is reporting a blocksize mismatch) is that
>>> your disks reported a 512B native blocksize.  In the absence of any override,
>>> ZFS will then build an ashift=9 pool.
> 
> [skip]
> 
>> smartctl 7.0 2018-12-30 r4883 [FreeBSD 11.2-RELEASE-p4 amd64] (local build)
>> Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
>>
>> === START OF INFORMATION SECTION ===
>> Vendor:   SEAGATE
>> Product:  ST2400MM0129
>> Revision: C003
>> Compliance:   SPC-4
>> User Capacity:        2,400,476,553,216 bytes [2.40 TB]
>> Logical block size:   512 bytes
>> Physical block size:  4096 bytes
> 
> Maybe it's time to prefer "Physical block size" over "Logical block size"
> in relevant GEOMs like GEOM_DISK, so that upper levels such as ZFS would
> do the right thing automatically.

No.  It is a bad idea.  Changing the logical block size of existing disks
would most likely break compatibility and make previously written data
unreadable.  ZFS already uses the physical block size when possible -- on
pool creation or new vdev addition.  When that is not possible (the pool
was already created with too small an ashift), it just complains, so that
the user knows the configuration is imperfect and should not expect full
performance.
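
For reference, what GEOM currently reports for a disk can be checked with
geom(8); ada1 below is only an example:

# Sectorsize is the logical block size, Stripesize the physical one
# (often reported as 0 when the drive does not advertise a larger
# physical sector)
geom disk list ada1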

-- 
Alexander Motin


Re: about zfs and ashift and changing ashift on existing zpool

2019-04-08 Thread Eugene Grosbein
09.04.2019 7:00, Kevin P. Neal wrote:

>> My guess (given that only ada1 is reporting a blocksize mismatch) is that
>> your disks reported a 512B native blocksize.  In the absence of any override,
>> ZFS will then build an ashift=9 pool.

[skip]

> smartctl 7.0 2018-12-30 r4883 [FreeBSD 11.2-RELEASE-p4 amd64] (local build)
> Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
> 
> === START OF INFORMATION SECTION ===
> Vendor:   SEAGATE
> Product:  ST2400MM0129
> Revision: C003
> Compliance:   SPC-4
> User Capacity:        2,400,476,553,216 bytes [2.40 TB]
> Logical block size:   512 bytes
> Physical block size:  4096 bytes

Maybe it's time to prefer "Physical block size" over "Logical block size"
in relevant GEOMs like GEOM_DISK, so that upper levels such as ZFS would
do the right thing automatically.




Re: about zfs and ashift and changing ashift on existing zpool

2019-04-08 Thread Peter Jeremy
On 2019-Apr-07 16:36:40 +0100, tech-lists  wrote:
>storage          ONLINE   0 0 0
>  raidz1-0       ONLINE   0 0 0
>    replacing-0  ONLINE   0 0 1.65K
>      ada2       ONLINE   0 0 0
>      ada1       ONLINE   0 0 0  block size: 512B configured, 4096B native
>    ada3         ONLINE   0 0 0
>    ada4         ONLINE   0 0 0
>
>What I'd like to know is:
>
>1. is the above situation harmful to data

In general, no.  The only danger is that ZFS updates the uberblock
replicas at the start and end of the volume assuming 512B sectors, which
means you are at a higher risk of losing one of the replica sets if a
power failure occurs during an uberblock update.

>2. given that vfs.zfs.min_auto_ashift=12, why does it still say 512B
>   configured for ada1 which is the new disk, or..
The pool is configured with ashift=9.  vfs.zfs.min_auto_ashift only takes
effect when a pool or a new top-level vdev is created; a replacement disk
inherits the existing vdev's ashift.

>3. does "configured" pertain to the pool, the disk, or both
"configured" relates to the pool - all vdevs match the pool

>4. what would be involved in making them all 4096B
Rebuild the pool - backup/destroy/create/restore (a rough sketch follows
after question 5).

>5. does a 512B disk wear out faster than 4096B (all other things being
>   equal)
It shouldn't.  It does mean that the disk is doing read/modify/write at
the physical sector level but that should be masked by the drive cache.
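
For question 4, the rough shape of that rebuild is below.  This is only a
sketch - "scratch" stands for whatever backup target you have, the layout
is taken from your zpool status - and obviously verify the backup before
destroying anything:

# assumes vfs.zfs.min_auto_ashift=12 is already set, so the recreated
# pool gets ashift=12
zfs snapshot -r storage@migrate
zfs send -R storage@migrate | zfs receive -F scratch/storage
zpool destroy storage
zpool create storage raidz1 ada1 ada2 ada3 ada4
zfs send -R scratch/storage@migrate | zfs receive -F storage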

>Given that the machine and disks were new in 2016, I can't understand why zfs
>didn't default to 4096B on installation

I can't answer that easily.  The current version of ZFS looks at the native
disk blocksize to determine the pool ashift but I'm not sure how things
were in 2016.  Possibilities include:
* The pool was built explicitly with ashift=9
* The initial disks reported 512B native (I think this is most likely)
* That version of ZFS was using logical, rather than native blocksize.

My guess (given that only ada1 is reporting a blocksize mismatch) is that
your disks reported a 512B native blocksize.  In the absence of any override,
ZFS will then build an ashift=9 pool.
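
If you want to confirm, something along these lines will show both what the
drives report and what the pool was built with (device and pool names taken
from your output, adjust as needed):

# sectorsize = logical, stripesize = physical block size as seen by GEOM
diskinfo -v ada1 | grep -E 'sectorsize|stripesize'
# the ashift the pool was actually created with (9 = 512B, 12 = 4096B)
zdb -C storage | grep ashift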

-- 
Peter Jeremy




Can someone MFC usb/234503

2019-04-08 Thread Mel Pilgrim
The patch was committed to head on March 11 and flagged for MFC, but it 
hasn't been merged to stable yet.  I've been running this modification on
11.2-R and 12.0-R using the affected devices as system disks for a 
little over two months without issue.  Can someone merge this into 
stable/11 and stable/12?  It's important (to me, at least) that it make 
it into 11.3-R, and the slush for it isn't far off.



Re: EFI loader doesn't handle md_preload (md_image) correct?

2019-04-08 Thread Toomas Soome via freebsd-stable
Yes, I still do remember it, just need to find time to port the feature. 

Of course the root cause there is much more complicated (we cannot assume
contiguous space above 1MB is available), but we can mostly cope...

Sent from my iPhone

> On 8 Apr 2019, at 09:44, Harry Schmalzbauer  wrote:
> 
>> [skip]
> 
> Hello Toomas,
> 
> thanks for your ongoing FreeBSD commits, saw your recent libstand 
> improvements and the efiloader commit.
> Which remembers me 

Re: EFI loader doesn't handle md_preload (md_image) correct?

2019-04-08 Thread Harry Schmalzbauer

On 16.05.2017 at 18:26, Harry Schmalzbauer wrote:

Regarding Toomas Soome's message of 16.05.2017 18:20 (localtime):

On 16 May 2017, at 19:13, Harry Schmalzbauer wrote:

Regarding Toomas Soome's message of 16.05.2017 18:00 (localtime):

On 16 May 2017, at 18:45, Harry Schmalzbauer <free...@omnilan.de> wrote:

Regarding Harry Schmalzbauer's message of 16.05.2017 17:28 (localtime):

Regarding Toomas Soome's message of 16.05.2017 16:57 (localtime):

On 16 May 2017, at 17:55, Harry Schmalzbauer <free...@omnilan.de> wrote:

Hello,

unfortunately I had some trouble with my preferred MFS-root setups.
It seems the EFI loader doesn't handle the md_image type correctly.

If I load any md_image with the loader invoked by gptboot or gptzfsboot,
'lsmod' shows "elf kernel", "elf obj module(s)" and "md_image".

Using the same loader.conf but the EFI loader, the md_image file is
prompted and seems to be loaded, but not registered.  There's no md_image
with 'lsmod', hence it's not astonishing that the kernel doesn't attach md0,
so booting fails since there's no rootfs.

Any help highly appreciated; hope Toomas doesn't mind being
initially CC'd.

Thanks,

-harry

The first question is, how large is the md_image and what other
modules are loaded?

Thanks for your quick response.

The images are 50-500MB uncompressed (provided as a gzip-compressed file).
A small number of ELF modules, 5, each ~50kB.

On the real HW, there's vmm and some more:
Id Refs Address Size Name
1   46 0x8020   16M kernel
21 0x8121d000   86K unionfs.ko
31 0x81233000  3.1M zfs.ko
42 0x81545000   51K opensolaris.ko
57 0x81552000  279K usb.ko
61 0x81598000   67K ukbd.ko
71 0x815a9000   51K umass.ko
81 0x815b6000   46K aesni.ko
91 0x815c3000   54K uhci.ko
101 0x815d1000   65K ehci.ko
111 0x815e2000   15K cc_htcp.ko
121 0x815e6000  3.4M vmm.ko
131 0xa3a21000   12K ums.ko
141 0xa3a24000  9.1K uhid.ko

Providing md_image uncompressed doesn't change anything.

Will deploy a /usr separated rootfs, which is only ~100MB uncompressed
and see if that changes anything.
That's all I can provide, code is far beyond my knowledge...

-harry


The issue is that the current UEFI implementation uses a 64MB staging
area for loading the kernel, modules and files. When boot is called, the
relocation code puts the bits from the staging area into their final
places. The BIOS version does not need such a staging area, and that
explains the difference.

I actually have a different implementation to address the same problem,
but that's for the illumos case, and it will need some work to make it
usable for FreeBSD; the idea is simple - allocate a staging area per
loaded file and relocate the bits into place by component, not as one
continuous large chunk (this would also allow avoiding the mines planted
by hyperv ;), but right now there is no quick real solution other than
building the EFI loader with a larger staging size.

I see, thanks for the explanation.
While I'm not aware of the purpose of the staging area nor of the
consequences of enlarging it, do you think it's feasible to increase it
to 768MiB?

At least now I have an idea about the issue and an explanation why
reducing md_image to 100MB hasn't helped – still more than 64...

Any quick hint where to define the staging area size would be highly
appreciated, if there are no hard objections against a 768MB size.

-harry

The problem is that until UEFI Boot Services are switched off, the memory
is managed (and owned) by the firmware,

Hmm, I've been expecting something like that (owned by the firmware) ;-)

So I'll stay with CSM for now, and will happily be an early adopter if
you need someone to try anything (-stable mergeable).


Hello Toomas,

thanks for your ongoing FreeBSD commits; I saw your recent libstand
improvements and the efiloader commit.

Which reminds me to nag the skilled ones about my unmet needs ;-)

I guess nobody had time to look at the MFS-root limitation with EFI vs. 
BIOS.

If you have any news/plans, please share.
The ability to boot via EFI gives a much better console 
experience/usability for admins, but on MFS-root systems I'm still
forced to use the old loader path because of the 64MB size limit.
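
For anyone else hitting this, the only workaround I know of is the one you
suggested back then: rebuilding loader.efi with a larger staging area.
Assuming the staging size is exposed as an EFI_STAGING_SIZE build knob (in
MB) in the stand/efi sources - I haven't checked which branches have it -
that would look roughly like:

# sketch only -- verify that your stand/efi/loader sources honour EFI_STAGING_SIZE
cd /usr/src/stand/efi/loader
make obj
make clean all EFI_STAGING_SIZE=768
make install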


Do you think there's a chance that this will be resolved for FreeBSD?

Thanks,

-harry
