RE: A blocksize problem about dax and ext4
> -----Original Message-----
> From: Christoph Hellwig [mailto:h...@infradead.org]
> Sent: Thursday, December 24, 2015 4:11 AM
> Subject: Re: A blocksize problem about dax and ext4
>
> On Thu, Dec 24, 2015 at 02:47:07AM +, Elliott, Robert (Persistent
> Memory) wrote:
> > > Did you mean that I should make the blocksize bigger until the mount
> > > command tell me that dax is enabled?
> >
> > To really use DAX, the filesystem block size must match the
> > system CPU's page size, which is probably 4096 bytes.
>
> No, it doesn't. The file you use for DAX must be aligned at page size
> granularity. For XFS you could do this with the per-inode extent
> size hint, for example, even if the overall block size is smaller.

I think that's a future goal. Currently, the checks are like this:

    if (sb->s_blocksize != PAGE_SIZE) {
        xfs_alert(mp,
            "Filesystem block size invalid for DAX Turning DAX off.");

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: A blocksize problem about dax and ext4
On Thu, Dec 24, 2015 at 02:47:07AM +, Elliott, Robert (Persistent Memory) wrote:
> > Did you mean that I should make the blocksize bigger until the mount
> > command tell me that dax is enabled?
>
> To really use DAX, the filesystem block size must match the
> system CPU's page size, which is probably 4096 bytes.

No, it doesn't. The file you use for DAX must be aligned at page size
granularity. For XFS you could do this with the per-inode extent
size hint, for example, even if the overall block size is smaller.
RE: A blocksize problem about dax and ext4
> -----Original Message-----
> From: Cholerae Hu [mailto:cholerae...@gmail.com]
> Sent: Wednesday, December 23, 2015 8:36 PM
> Subject: Re: A blocksize problem about dax and ext4
...
> xfs will silently disable dax when the fs block size is too small,
> i.e. your mmap() operations are backed by page cache in this case.
> Currently the only indication of whether a mapping is DAX backed or
> not is the presence of the VM_MIXEDMAP flag ("mm" in the VmFlags field
> of /proc/<pid>/smaps)
>
> Did you mean that I should make the blocksize bigger until the mount
> command tell me that dax is enabled?

To really use DAX, the filesystem block size must match the
system CPU's page size, which is probably 4096 bytes.

---
Robert Elliott, HPE Persistent Memory
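For illustration, the rule stated above (filesystem block size equal to the
CPU page size) can be checked from user space before mounting with -o dax.
This is only a sketch: the function names are hypothetical, and it assumes
that statvfs() reports the filesystem block size in f_bsize, as ext4 and
XFS do on Linux. It says nothing about whether the kernel or the block
device actually supports DAX.

```c
#include <stdbool.h>
#include <sys/statvfs.h>
#include <unistd.h>

/* The rule from the thread: the block size must equal the page size. */
bool blocksize_ok_for_dax(unsigned long fs_block_size,
                          unsigned long page_size)
{
    return fs_block_size != 0 && fs_block_size == page_size;
}

/*
 * Hypothetical helper: compare the block size that statvfs() reports
 * for the filesystem holding `path` with the system page size.
 * Returns 1 if they match, 0 if they differ, -1 on error.
 */
int path_blocksize_matches_page(const char *path)
{
    struct statvfs st;
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0 || statvfs(path, &st) != 0)
        return -1;
    return blocksize_ok_for_dax(st.f_bsize, (unsigned long)page) ? 1 : 0;
}
```

Calling path_blocksize_matches_page("/mnt/mem") on an already mounted
filesystem would return 0 for the 1024-byte case from this thread on a
system with 4096-byte pages.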
Re: A blocksize problem about dax and ext4
On Wed, Dec 23, 2015 at 4:34 PM, Cholerae Hu wrote:
> The block size is 1024.
> # dumpe2fs -h /dev/pmem0 | grep "Block size"
> dumpe2fs 1.42.13 (17-May-2015)
> Block size:               1024
>
> I tried it out on xfs and I succeeded. There are the prompting messages:
> # mkfs.xfs -f -b size=1024 /dev/pmem0
> meta-data=/dev/pmem0             isize=512    agcount=4, agsize=32768 blks
>          =                       sectsz=512   attr=2, projid32bit=1
>          =                       crc=1        finobt=1
> data     =                       bsize=1024   blocks=131072, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
> log      =internal log           bsize=1024   blocks=2571, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> # mount -o dax /dev/pmem0 /mnt/mem
>
> The mount command doesn't return any message, and I can successfully read or
> write files in /mnt/mem.
>

xfs will silently disable dax when the fs block size is too small,
i.e. your mmap() operations are backed by page cache in this case.
Currently the only indication of whether a mapping is DAX backed or
not is the presence of the VM_MIXEDMAP flag ("mm" in the VmFlags field
of /proc/<pid>/smaps)
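A rough sketch of how the "mm" check described above could be automated.
The function names are hypothetical, and the idea that "mm" in VmFlags
indicates a DAX mapping is taken from this thread only; it describes
kernels of that period and may not hold elsewhere.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if `line` is a "VmFlags:" line from smaps that lists the "mm" flag. */
bool vmflags_line_has_mm(const char *line)
{
    static const char prefix[] = "VmFlags:";
    const char *p;

    if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
        return false;

    p = line + sizeof(prefix) - 1;
    while (*p) {
        size_t len;

        p += strspn(p, " \t\n");      /* skip separators */
        len = strcspn(p, " \t\n");    /* length of the next flag token */
        if (len == 2 && p[0] == 'm' && p[1] == 'm')
            return true;
        p += len;
    }
    return false;
}

/* Count the mappings in an smaps-format stream whose VmFlags include "mm". */
int count_mixedmap_mappings(FILE *smaps)
{
    char line[512];
    int count = 0;

    while (fgets(line, sizeof(line), smaps))
        if (vmflags_line_has_mm(line))
            count++;
    return count;
}
```

A caller would open /proc/<pid>/smaps (or /proc/self/smaps) with fopen()
and pass the stream to count_mixedmap_mappings(). Matching a specific
mapping by address range is left out to keep the sketch short.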
Re: A blocksize problem about dax and ext4
On Wed, Dec 23, 2015 at 09:18:05PM +, Elliott, Robert (Persistent Memory) wrote:
>
> > -----Original Message-----
> > From: Linux-nvdimm [mailto:linux-nvdimm-boun...@lists.01.org] On Behalf Of
> > Dan Williams
> > Sent: Wednesday, December 23, 2015 11:16 AM
> > To: Cholerae Hu <cholerae...@gmail.com>
> > Cc: linux-nvd...@lists.01.org
> > Subject: Re: A blocksize problem about dax and ext4
> >
> > On Wed, Dec 23, 2015 at 4:03 AM, Cholerae Hu <cholerae...@gmail.com>
> > wrote:
> ...
> > > [root@localhost cholerae]# mount -o dax /dev/pmem0 /mnt/mem
> > > mount: wrong fs type, bad option, bad superblock on /dev/pmem0,
> > >        missing codepage or helper program, or other error
> > >
> > >        In some cases useful info is found in syslog - try
> > >        dmesg | tail or so.
> > > [root@localhost cholerae]# dmesg | tail
> ...
> > > [   81.779582] EXT4-fs (pmem0): error: unsupported blocksize for dax
> ...
> >
> > What's the fs block size?  For example:
> > # dumpe2fs -h /dev/pmem0 | grep "Block size"
> > dumpe2fs 1.42.9 (28-Dec-2013)
> > Block size:               4096
> > Depending on the size of /dev/pmem0 it may have automatically set it
> > to a block size less than 4 KiB which is incompatible with "-o dax".
>
> I noticed a few things while trying that out on both ext4 and xfs.
>
> $ sudo mkfs.ext4 -F -b 1024 /dev/pmem0
> $ sudo mount -o dax /dev/pmem0 /mnt/ext4-pmem0
> $ sudo mkfs.xfs -f -b size=1024 /dev/pmem0
> $ sudo mount -o dax /dev/pmem0 /mnt/xfs-pmem0
>
> [  199.679195] EXT4-fs (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at
> your own risk
> [  199.724931] EXT4-fs (pmem0): error: unsupported block size 1024 for dax
> [  859.077766] XFS (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at your
> own risk
> [  859.118106] XFS (pmem0): Filesystem block size invalid for DAX Turning DAX
> off.
> [  859.156950] XFS (pmem0): Mounting V4 Filesystem
> [  859.183626] XFS (pmem0): Ending clean mount
>
> 1. ext4 fails to mount the filesystem, while xfs just disables DAX.
> It seems like they should be the same.

I don't really care what is done to ext4 here, but I'm not changing
XFS behaviour. I'm expecting mixed dax/non-dax filesystems to be a
thing, with DAX turned on by an inode flag on disk. Indeed, I see
the mount option going away permanently for XFS, and DAX being
controlled completely from on-disk flags.

E.g. ext4 encrypted files need to turn off DAX, while clear text
files can be accessed using DAX. This should happen completely
transparently to the user.

In the situation of block size < page size, there's things we can do
to ensure that XFS will allocate page size aligned/sized extents
(extent size hints FTW). This is the same mechanism that we'll use
to ensure that extents are aligned/sized for reliable huge page
mappings. Hence while DAX /as a global option/ needs to be turned
off for sub-page block size filesystems, there's no reason why we
can't turn DAX on for files that will always allocate blocks
according to DAX constraints.

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
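As an illustration of the extent size hint mentioned above, here is a
minimal user-space sketch that sets a per-inode hint with the generic
FS_IOC_FSGETXATTR / FS_IOC_FSSETXATTR ioctls from <linux/fs.h>. The
function name is hypothetical, and the sketch assumes kernel headers new
enough to provide these generic names (when this thread was written the
same interface was exposed through XFS-specific headers). Setting the
hint does not by itself enable DAX on a sub-page block size filesystem;
the thread describes that as something that could be done, not as
existing behaviour.

```c
#include <linux/fs.h>   /* struct fsxattr, FS_IOC_FS[GS]ETXATTR, FS_XFLAG_EXTSIZE */
#include <sys/ioctl.h>

/*
 * Hypothetical helper: set an extent size hint of `bytes` on the open
 * file `fd`. Returns 0 on success, -1 on failure (errno is set by ioctl).
 */
int set_extsize_hint(int fd, unsigned int bytes)
{
    struct fsxattr fsx;

    /* Read the current attributes so the other fields are preserved. */
    if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) < 0)
        return -1;

    fsx.fsx_xflags |= FS_XFLAG_EXTSIZE;  /* mark the hint as valid */
    fsx.fsx_extsize = bytes;             /* e.g. the page size, 4096 */

    if (ioctl(fd, FS_IOC_FSSETXATTR, &fsx) < 0)
        return -1;
    return 0;
}
```

On XFS the hint is expected to be a multiple of the filesystem block size
and is normally set on a newly created, still empty file; other
filesystems may reject the ioctl entirely.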
RE: A blocksize problem about dax and ext4
> -----Original Message-----
> From: Linux-nvdimm [mailto:linux-nvdimm-boun...@lists.01.org] On Behalf Of
> Dan Williams
> Sent: Wednesday, December 23, 2015 11:16 AM
> To: Cholerae Hu
> Cc: linux-nvd...@lists.01.org
> Subject: Re: A blocksize problem about dax and ext4
>
> On Wed, Dec 23, 2015 at 4:03 AM, Cholerae Hu
> wrote:
...
> > [root@localhost cholerae]# mount -o dax /dev/pmem0 /mnt/mem
> > mount: wrong fs type, bad option, bad superblock on /dev/pmem0,
> >        missing codepage or helper program, or other error
> >
> >        In some cases useful info is found in syslog - try
> >        dmesg | tail or so.
> > [root@localhost cholerae]# dmesg | tail
...
> > [   81.779582] EXT4-fs (pmem0): error: unsupported blocksize for dax
...
>
> What's the fs block size?  For example:
> # dumpe2fs -h /dev/pmem0 | grep "Block size"
> dumpe2fs 1.42.9 (28-Dec-2013)
> Block size:               4096
> Depending on the size of /dev/pmem0 it may have automatically set it
> to a block size less than 4 KiB which is incompatible with "-o dax".

I noticed a few things while trying that out on both ext4 and xfs.

$ sudo mkfs.ext4 -F -b 1024 /dev/pmem0
$ sudo mount -o dax /dev/pmem0 /mnt/ext4-pmem0
$ sudo mkfs.xfs -f -b size=1024 /dev/pmem0
$ sudo mount -o dax /dev/pmem0 /mnt/xfs-pmem0

[  199.679195] EXT4-fs (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
[  199.724931] EXT4-fs (pmem0): error: unsupported block size 1024 for dax
[  859.077766] XFS (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
[  859.118106] XFS (pmem0): Filesystem block size invalid for DAX Turning DAX off.
[  859.156950] XFS (pmem0): Mounting V4 Filesystem
[  859.183626] XFS (pmem0): Ending clean mount

1. ext4 fails to mount the filesystem, while xfs just disables DAX.
It seems like they should be the same.

2. If CONFIG_FS_DAX is not supported, ext4 fails to mount, but prints
the message at the KERN_INFO level. All the rest of its mount errors
use KERN_ERR.

Completely unknown mount options are reported like this at the
KERN_ERR level:
[ 2188.194775] EXT4-fs (pmem0): Unrecognized mount option "xyzzy" or missing value

In contrast, if CONFIG_FS_DAX is not supported, then xfs lumps it in
with the rest of the unknown mount options, which are reported with
xfs_warn():
[ 2347.654182] XFS (pmem0): unknown mount option [xyzzy].

3. It might be worth printing the problematic filesystem block size
(here and in a few other similar messages). I like how xfs' wording
of "Filesystem block size" helps distinguish the value from the
block device's logical block size.

Code excerpts
=============

fs/xfs/xfs_super.c:
#ifdef CONFIG_FS_DAX
        } else if (!strcmp(this_char, MNTOPT_DAX)) {
            mp->m_flags |= XFS_MOUNT_DAX;
#endif
...
    if (mp->m_flags & XFS_MOUNT_DAX) {
        xfs_warn(mp,
            "DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
        if (sb->s_blocksize != PAGE_SIZE) {
            xfs_alert(mp,
                "Filesystem block size invalid for DAX Turning DAX off.");
            mp->m_flags &= ~XFS_MOUNT_DAX;
        } else if (!sb->s_bdev->bd_disk->fops->direct_access) {
            xfs_alert(mp,
                "Block device does not support DAX Turning DAX off.");
            mp->m_flags &= ~XFS_MOUNT_DAX;
        }
    }

fs/ext4/super.c:
    } else if (token == Opt_dax) {
#ifdef CONFIG_FS_DAX
        ext4_msg(sb, KERN_WARNING,
            "DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
        sbi->s_mount_opt |= m->mount_opt;
#else
        ext4_msg(sb, KERN_INFO, "dax option not supported");
        return -1;
#endif

---
Robert Elliott, HPE Persistent Memory
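To make suggestion 3 concrete, here is a user-space mock of what a message
that includes the offending block size might look like. It is purely
illustrative: the function name and the exact wording are hypothetical, and
it uses snprintf() in place of the kernel's xfs_alert()/ext4_msg() helpers.

```c
#include <stdio.h>

/*
 * Hypothetical mock: format a DAX block size complaint that reports both
 * the filesystem block size and the page size it was compared against.
 * Returns the value of snprintf().
 */
int format_dax_blocksize_msg(char *buf, size_t len,
                             unsigned long fs_block_size,
                             unsigned long page_size)
{
    return snprintf(buf, len,
        "Filesystem block size %lu invalid for DAX (page size %lu), "
        "turning DAX off.",
        fs_block_size, page_size);
}
```

With the values from the test above (block size 1024, page size 4096), the
reader of dmesg would see both numbers and would not need to run dumpe2fs
or xfs_info to find out which size was rejected.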