On Mon, Jan 30, 2017 at 4:32 AM, Christoph Hellwig <h...@lst.de> wrote:
> On Sat, Jan 28, 2017 at 12:36:58AM -0800, Dan Williams wrote:
>> Provide a replacement for bdev_direct_access() that uses
>> dax_operations.direct_access() instead of
>> block_device_operations.direct_access(). Once all consumers of the old
>> api have been converted bdev_direct_access() will be deleted.
>>
>> Given that block device partitioning decisions can cause dax page
>> alignment constraints to be violated we still need to validate the
>> block_device before calling the dax ->direct_access method.
>>
>> Signed-off-by: Dan Williams <dan.j.willi...@intel.com>
>> ---
>>  block/Kconfig          |    1 +
>>  drivers/dax/super.c    |   33 +++++++++++++++++++++++++++++++++
>>  fs/block_dev.c         |   28 ++++++++++++++++++++++++++++
>>  include/linux/blkdev.h |    3 +++
>>  include/linux/dax.h    |    2 ++
>>  5 files changed, 67 insertions(+)
>>
>> diff --git a/block/Kconfig b/block/Kconfig
>> index 8bf114a3858a..9be785173280 100644
>> --- a/block/Kconfig
>> +++ b/block/Kconfig
>> @@ -6,6 +6,7 @@ menuconfig BLOCK
>>         default y
>>         select SBITMAP
>>         select SRCU
>> +       select DAX
>>         help
>>        Provide block layer support for the kernel.
>>
>> diff --git a/drivers/dax/super.c b/drivers/dax/super.c
>> index eb844ffea3cf..ab5b082df5dd 100644
>> --- a/drivers/dax/super.c
>> +++ b/drivers/dax/super.c
>> @@ -65,6 +65,39 @@ struct dax_inode {
>>       const struct dax_operations *ops;
>>  };
>>
>> +long dax_direct_access(struct dax_inode *dax_inode, phys_addr_t dev_addr,
>> +             void **kaddr, pfn_t *pfn, long size)
>> +{
>> +     long avail;
>> +
>> +     /*
>> +      * The device driver is allowed to sleep, in order to make the
>> +      * memory directly accessible.
>> +      */
>> +     might_sleep();
>> +
>> +     if (!dax_inode)
>> +             return -EOPNOTSUPP;
>> +
>> +     if (!dax_inode_alive(dax_inode))
>> +             return -ENXIO;
>> +
>> +     if (size < 0)
>> +             return size;
>> +
>> +     if (dev_addr % PAGE_SIZE)
>> +             return -EINVAL;
>> +
>> +     avail = dax_inode->ops->direct_access(dax_inode, dev_addr, kaddr, pfn,
>> +                     size);
>> +     if (!avail)
>> +             return -ERANGE;
>> +     if (avail > 0 && avail & ~PAGE_MASK)
>> +             return -ENXIO;
>> +     return min(avail, size);
>> +}
>> +EXPORT_SYMBOL_GPL(dax_direct_access);
>> +
>>  bool dax_inode_alive(struct dax_inode *dax_inode)
>>  {
>>       lockdep_assert_held(&dax_srcu);
>> diff --git a/fs/block_dev.c b/fs/block_dev.c
>> index edb1d2b16b8f..bf4b51a3a412 100644
>> --- a/fs/block_dev.c
>> +++ b/fs/block_dev.c
>> @@ -18,6 +18,7 @@
>>  #include <linux/module.h>
>>  #include <linux/blkpg.h>
>>  #include <linux/magic.h>
>> +#include <linux/dax.h>
>>  #include <linux/buffer_head.h>
>>  #include <linux/swap.h>
>>  #include <linux/pagevec.h>
>> @@ -763,6 +764,33 @@ long bdev_direct_access(struct block_device *bdev, 
>> struct blk_dax_ctl *dax)
>>  EXPORT_SYMBOL_GPL(bdev_direct_access);
>>
>>  /**
>> + * bdev_dax_direct_access() - bdev-sector to pfn_t and kernel virtual 
>> address
>> + * @bdev: host block device for @dax_inode
>> + * @dax_inode: interface data and operations for a memory device
>> + * @dax: control and output parameters for ->direct_access
>> + *
>> + * Return: negative errno if an error occurs, otherwise the number of bytes
>> + * accessible at this address.
>> + *
>> + * Locking: must be called with dax_read_lock() held
>> + */
>> +long bdev_dax_direct_access(struct block_device *bdev,
>> +             struct dax_inode *dax_inode, struct blk_dax_ctl *dax)
>> +{
>> +     sector_t sector = dax->sector;
>> +
>> +     if (!blk_queue_dax(bdev->bd_queue))
>> +             return -EOPNOTSUPP;
>
> I don't think this should take a bdev - the caller should know if
> it has a dax_inode.  Also if you touch this anyway can we kill
> the annoying struct blk_dax_ctl calling convention?  Passing the
> four arguments explicitly is just a lot more readable and understandable.

Ok, now that dax_map_atomic() is gone, it's much easier to remove
struct blk_dax_ctl.

We can also move the partition alignment checks to be a one-time check
at bdev_dax_capable() time and kill bdev_dax_direct_access() in favor
of calling dax_direct_access() directly.

>> +     if ((sector + DIV_ROUND_UP(dax->size, 512))
>> +                     > part_nr_sects_read(bdev->bd_part))
>> +             return -ERANGE;
>> +     sector += get_start_sect(bdev);
>> +     return dax_direct_access(dax_inode, sector * 512, &dax->addr,
>> +                     &dax->pfn, dax->size);
>
> And please switch to using bytes as the granularity given that we're
> deadling with byte addressable memory.

dax_direct_access() does take a byte aligned physical address, but it
needs to be at least page aligned since we are returning a pfn_t...

Hmm, perhaps the input should be raw page frame number. We could
reduce one of the arguments by making the current 'pfn_t *' parameter
an in/out-parameter.
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to