[RFC 12/16] NOVA: Recovery code

2017-08-03 Thread Steven Swanson
Clean umount/mount -- On a clean unmount, Nova saves the contents of many of its DRAM data structures to PMEM to accelerate the next mount: 1. Nova stores the allocator state for each of the per-cpu allocators to the log of a reserved inode (NOVA_BLOCK_NODE_INO). 2. Nova

[RFC 15/16] NOVA: Performance measurement

2017-08-03 Thread Steven Swanson
Signed-off-by: Steven Swanson --- fs/nova/perf.c | 594 fs/nova/perf.h | 96 fs/nova/stats.c | 685 +++ fs/nova/stats.h | 218 ++ 4 files

[RFC 13/16] NOVA: Sysfs and ioctl

2017-08-03 Thread Steven Swanson
Nova provides the normal ioctls for setting file attributes and provides a /proc-based interface for taking snapshots. Signed-off-by: Steven Swanson --- fs/nova/ioctl.c | 185 +++ fs/nova/sysfs.c | 543

Re: [PATCH v2 5/5] libnvdimm: add DMA support for pmem blk-mq

2017-08-03 Thread Vinod Koul
On Thu, Aug 03, 2017 at 11:06:13AM +0530, Jiang, Dave wrote: > > > > On Aug 2, 2017, at 10:25 PM, Koul, Vinod wrote: > > > >> On Thu, Aug 03, 2017 at 10:41:51AM +0530, Jiang, Dave wrote: > >> > >> > On Aug 2, 2017, at 9:58 PM, Koul, Vinod

[RFC 10/16] NOVA: File data protection

2017-08-03 Thread Steven Swanson
Nova protects data and metadata from corruption due to media errors and scribbles -- software errors in the kernel that may overwrite Nova data. Replication --- Nova replicates all PMEM metadata structures (there are a few exceptions; they are WIP). For each structure, there is a primary

[RFC 07/16] NOVA: File and directory operations

2017-08-03 Thread Steven Swanson
To access file data via read(), Nova maintains a radix tree in DRAM for each inode (nova_inode_info_header.tree) that maps file offsets to write log entries. For directories, the same tree maps a hash of filenames to their corresponding dentry. In both cases, Nova populates the tree when the

[RFC 08/16] NOVA: Garbage collection

2017-08-03 Thread Steven Swanson
Nova recovers log space with a two-phase garbage collection system. When a log reaches the end of its allocated pages, Nova allocates more space. Then, the fast GC algorithm scans the log to remove pages that have no valid entries. Then, it estimates how many pages the log's valid entries would
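
The fast GC pass described above can be pictured as unlinking dead pages from a linked list. This is only a minimal sketch: `struct log_page`, `valid_entries`, and `fast_gc` are hypothetical names chosen for illustration and are not taken from the NOVA patches, and the real code presumably also handles persistence ordering and the head/tail pages.

```c
#include <stddef.h>

/* Hypothetical in-memory view of one log page (not NOVA's real layout). */
struct log_page {
    struct log_page *next;       /* next page in the inode's log */
    unsigned int valid_entries;  /* entries still referenced */
};

/*
 * Unlink every page that has no valid entries and return the new head.
 * *freed receives the number of pages unlinked; releasing the pages is
 * left to the caller.
 */
static struct log_page *fast_gc(struct log_page *head, unsigned int *freed)
{
    struct log_page **link = &head;

    *freed = 0;
    while (*link) {
        if ((*link)->valid_entries == 0) {
            *link = (*link)->next;   /* drop the dead page */
            (*freed)++;
        } else {
            link = &(*link)->next;   /* keep the page, move on */
        }
    }
    return head;
}
```

Walking with a pointer-to-pointer avoids a special case for removing the first page.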

[RFC 09/16] NOVA: DAX code

2017-08-03 Thread Steven Swanson
NOVA leverages the kernel's DAX mechanisms for mmap and file data access. Nova maintains a red-black tree in DRAM (nova_inode_info_header.vma_tree) to track which portions of a file have been mapped. Signed-off-by: Steven Swanson --- fs/nova/dax.c | 1346

[RFC 11/16] NOVA: Snapshot support

2017-08-03 Thread Steven Swanson
Nova supports snapshots to facilitate backups. Taking a snapshot - Each Nova file system has a current epoch_id in the super block and each log entry has the epoch_id attached to it at creation. When the user creates a snapshot, Nova increments the epoch_id for the file system

[RFC 01/16] NOVA: Documentation

2017-08-03 Thread Steven Swanson
A brief overview is in README.md. Implementation and usage details are in Documentation/filesystems/nova.txt. These two papers provide a detailed, high-level description of NOVA's design goals and approach: NOVA: A Log-structured File system for Hybrid Volatile/Non-volatile Main Memories

[RFC 14/16] NOVA: Read-only pmem devices

2017-08-03 Thread Steven Swanson
Add (and implement) a module command line option to nd_pmem to support read-only pmem devices. Signed-off-by: Steven Swanson --- arch/x86/include/asm/io.h |1 + arch/x86/mm/ioremap.c | 25 ++--- drivers/nvdimm/pmem.c | 14 --

Re: [PATCH 5/5] libnvdimm: add DMA support for pmem blk-mq

2017-08-03 Thread Johannes Thumshirn
On Tue, Aug 01, 2017 at 10:43:30AM -0700, Dan Williams wrote: > On Tue, Aug 1, 2017 at 12:34 AM, Johannes Thumshirn > wrote: > > Dave Jiang writes: > > > >> Adding DMA support for pmem blk reads. This provides significant CPU > >> reduction with large

Re: [PATCH 0/3] remove rw_page() from brd, pmem and btt

2017-08-03 Thread Christoph Hellwig
FYI, for the read side we should use the on-stack bio unconditionally, as it will always be a win (or not show up at all). ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm

[RFC 02/16] NOVA: Superblock and fs layout

2017-08-03 Thread Steven Swanson
FS Layout == A Nova file system resides in a single PMEM device. Nova divides the device into 4KB blocks that are arranged like so: block +-+ | 0 | primary super block (struct nova_super_block) |

[RFC 04/16] NOVA: Inode operations and structures

2017-08-03 Thread Steven Swanson
Nova maintains per-CPU inode tables, and inode numbers are striped across the tables (i.e., inos 0, n, 2n,... on cpu 0; inos 1, n + 1, 2n + 1, ... on cpu 1). The inodes themselves live in a set of linked lists (one per CPU) of 2MB blocks. The last 8 bytes of each block point to the next block.
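
The striping rule above amounts to a modulo and a divide. A minimal sketch follows; the helper names `ino_to_cpu` and `ino_to_slot` are hypothetical and only illustrate the arithmetic, they are not functions from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: which per-CPU inode table owns inode number `ino`
 * when there are `ncpus` tables. Inos 0, n, 2n, ... map to CPU 0;
 * inos 1, n + 1, 2n + 1, ... map to CPU 1; and so on.
 */
static inline unsigned int ino_to_cpu(uint64_t ino, unsigned int ncpus)
{
    return (unsigned int)(ino % ncpus);
}

/* Hypothetical helper: position of `ino` within its per-CPU table. */
static inline uint64_t ino_to_slot(uint64_t ino, unsigned int ncpus)
{
    return ino / ncpus;
}
```

For example, with 4 CPUs, inode 9 would land on CPU 1 at slot 2.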

[RFC 05/16] NOVA: Log data structures and operations

2017-08-03 Thread Steven Swanson
Nova maintains a log for each inode that records updates to the inode's metadata and holds pointers to the file data. Nova makes updates to file data and metadata atomic by atomically appending log entries to the log. Each inode contains pointers to head and tail of the inode's log. When the
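
One common way to make a log append atomic is to write the entry beyond the current tail and then publish it with a single tail update. The sketch below assumes a single writer per log and uses C11 atomics in place of the cache flushes and fences that real PMEM code would need; `struct inode_log`, `LOG_CAPACITY`, and `log_append` are hypothetical names, not NOVA's.

```c
#include <stdatomic.h>
#include <stdint.h>

#define LOG_CAPACITY 16

/* Hypothetical fixed-size log; NOVA's real log is a chain of pages. */
struct inode_log {
    uint64_t entries[LOG_CAPACITY];
    _Atomic uint64_t tail;   /* index of the first unused slot */
};

/* Append one entry; returns 0 on success, -1 if the log is full. */
static int log_append(struct inode_log *log, uint64_t entry)
{
    uint64_t t = atomic_load_explicit(&log->tail, memory_order_relaxed);

    if (t == LOG_CAPACITY)
        return -1;

    /* The slot lies beyond the tail, so readers ignore it for now. */
    log->entries[t] = entry;

    /* On real PMEM the entry would be flushed and fenced here. */

    /* A single store of the new tail makes the entry visible. */
    atomic_store_explicit(&log->tail, t + 1, memory_order_release);
    return 0;
}
```

If a crash happened before the tail store, the half-written entry would simply never be seen, which is the property the snippet describes.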

[RFC 06/16] NOVA: Lite-weight journaling for complex ops

2017-08-03 Thread Steven Swanson
Nova uses a lightweight journaling mechanism to provide atomicity for operations that modify more than one inode. The journals provide logging for two operations: 1. Single word updates (JOURNAL_ENTRY) 2. Copying inodes (JOURNAL_INODE) The journals are undo logs: Nova creates the
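
As a rough illustration of undo logging for single-word updates, the sketch below records the old value before each store and restores values in reverse order on rollback. All names here (`struct undo_journal`, `journal_update`, and so on) are hypothetical, and the flush/fence ordering that PMEM requires is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define JOURNAL_MAX 8

/* Hypothetical undo record for one 64-bit word. */
struct undo_entry {
    uint64_t *addr;  /* word that is about to be modified */
    uint64_t old;    /* value to restore on rollback */
};

struct undo_journal {
    struct undo_entry entries[JOURNAL_MAX];
    size_t count;    /* 0 means nothing to undo */
};

/* Log the old value, then perform the update. Returns -1 if full. */
static int journal_update(struct undo_journal *j, uint64_t *addr, uint64_t val)
{
    if (j->count == JOURNAL_MAX)
        return -1;
    j->entries[j->count].addr = addr;
    j->entries[j->count].old = *addr;
    j->count++;
    *addr = val;
    return 0;
}

/* Commit: the new values stand, so the undo records are discarded. */
static void journal_commit(struct undo_journal *j)
{
    j->count = 0;
}

/* Rollback (e.g. during recovery): restore old values, newest first. */
static void journal_rollback(struct undo_journal *j)
{
    while (j->count > 0) {
        j->count--;
        *j->entries[j->count].addr = j->entries[j->count].old;
    }
}
```

A multi-inode operation would call `journal_update` for each word it touches and `journal_commit` once all of them are in place.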

[RFC 03/16] NOVA: PMEM allocation system

2017-08-03 Thread Steven Swanson
Nova uses per-CPU allocators to manage free PMEM blocks. On initialization, NOVA divides the range of blocks in the PMEM device among the CPUs, and those blocks are managed solely by that CPU. We call these ranges allocation regions. Some of the blocks in an allocation region have fixed
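
A simple way to picture the division into allocation regions is an even split of the block range, with the last CPU absorbing any remainder. This is an assumption made for illustration; the snippet does not say how NOVA handles uneven splits, and `struct alloc_region` and `region_for_cpu` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical description of one CPU's allocation region. */
struct alloc_region {
    uint64_t start;  /* first block in the region */
    uint64_t end;    /* last block in the region (inclusive) */
};

/*
 * Split blocks [0, total_blocks) evenly among ncpus CPUs.
 * Assumes total_blocks >= ncpus and cpu < ncpus; the last CPU
 * also takes the blocks left over by the integer division.
 */
static struct alloc_region region_for_cpu(uint64_t total_blocks,
                                          unsigned int ncpus,
                                          unsigned int cpu)
{
    uint64_t per_cpu = total_blocks / ncpus;
    struct alloc_region r;

    r.start = per_cpu * cpu;
    r.end = (cpu == ncpus - 1) ? total_blocks - 1
                               : r.start + per_cpu - 1;
    return r;
}
```

Because each region belongs to exactly one CPU, allocations within it need no cross-CPU coordination in the common case.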

[RFC 00/16] NOVA: a new file system for persistent memory

2017-08-03 Thread Steven Swanson
This is an RFC patch series that implements NOVA (NOn-Volatile memory Accelerated file system), a new file system built for PMEM. NOVA's goal is to provide a high-performance, full-featured, production-ready file system tailored for byte-addressable non-volatile memories (e.g., NVDIMMs and Intel's

Re: [PATCH 5/5] libnvdimm: add DMA support for pmem blk-mq

2017-08-03 Thread Dan Williams
On Thu, Aug 3, 2017 at 1:06 AM, Johannes Thumshirn wrote: > On Tue, Aug 01, 2017 at 10:43:30AM -0700, Dan Williams wrote: >> On Tue, Aug 1, 2017 at 12:34 AM, Johannes Thumshirn >> wrote: >> > Dave Jiang writes: >> > >> >> Adding DMA

Re: [PATCH 5/5] libnvdimm: add DMA support for pmem blk-mq

2017-08-03 Thread Dan Williams
On Thu, Aug 3, 2017 at 9:12 AM, Dave Jiang wrote: > > > On 08/03/2017 08:41 AM, Dan Williams wrote: >> On Thu, Aug 3, 2017 at 1:06 AM, Johannes Thumshirn >> wrote: >>> On Tue, Aug 01, 2017 at 10:43:30AM -0700, Dan Williams wrote: On Tue, Aug 1,

Re: [PATCH 1/3] btt: remove btt_rw_page()

2017-08-03 Thread kbuild test robot
Hi Ross, [auto build test WARNING on linux-nvdimm/libnvdimm-for-next] [also build test WARNING on v4.13-rc3] [cannot apply to next-20170803] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Ross

Re: [PATCH v2 5/5] libnvdimm: add DMA support for pmem blk-mq

2017-08-03 Thread Jiang, Dave
> On Aug 3, 2017, at 1:56 AM, Koul, Vinod wrote: > >> On Thu, Aug 03, 2017 at 11:06:13AM +0530, Jiang, Dave wrote: >> >> On Aug 2, 2017, at 10:25 PM, Koul, Vinod wrote: On Thu, Aug 03, 2017 at 10:41:51AM +0530, Jiang, Dave wrote:

Re: [PATCH v2 5/5] libnvdimm: add DMA support for pmem blk-mq

2017-08-03 Thread Vinod Koul
On Thu, Aug 03, 2017 at 08:06:07PM +0530, Jiang, Dave wrote: > > > > On Aug 3, 2017, at 1:56 AM, Koul, Vinod wrote: > > > >> On Thu, Aug 03, 2017 at 11:06:13AM +0530, Jiang, Dave wrote: > >> > >> > On Aug 2, 2017, at 10:25 PM, Koul, Vinod

Re: [PATCH v2 5/5] libnvdimm: add DMA support for pmem blk-mq

2017-08-03 Thread Dan Williams
On Thu, Aug 3, 2017 at 8:55 AM, Vinod Koul wrote: > On Thu, Aug 03, 2017 at 08:06:07PM +0530, Jiang, Dave wrote: >> >> >> > On Aug 3, 2017, at 1:56 AM, Koul, Vinod wrote: >> > >> >> On Thu, Aug 03, 2017 at 11:06:13AM +0530, Jiang, Dave wrote: >> >> >>

Re: [PATCH 5/5] libnvdimm: add DMA support for pmem blk-mq

2017-08-03 Thread Dave Jiang
On 08/03/2017 08:41 AM, Dan Williams wrote: > On Thu, Aug 3, 2017 at 1:06 AM, Johannes Thumshirn wrote: >> On Tue, Aug 01, 2017 at 10:43:30AM -0700, Dan Williams wrote: >>> On Tue, Aug 1, 2017 at 12:34 AM, Johannes Thumshirn >>> wrote: Dave Jiang

RE: [PATCH v2 5/5] libnvdimm: add DMA support for pmem blk-mq

2017-08-03 Thread Allen Hubbe
From: Dave Jiang > On 08/03/2017 09:14 AM, Dan Williams wrote: > > On Thu, Aug 3, 2017 at 8:55 AM, Vinod Koul wrote: > >> On Thu, Aug 03, 2017 at 08:06:07PM +0530, Jiang, Dave wrote: > On Aug 3, 2017, at 1:56 AM, Koul, Vinod wrote: > > On Thu,

Re: [PATCH v2 5/5] libnvdimm: add DMA support for pmem blk-mq

2017-08-03 Thread Dave Jiang
On 08/03/2017 09:14 AM, Dan Williams wrote: > On Thu, Aug 3, 2017 at 8:55 AM, Vinod Koul wrote: >> On Thu, Aug 03, 2017 at 08:06:07PM +0530, Jiang, Dave wrote: >>> >>> On Aug 3, 2017, at 1:56 AM, Koul, Vinod wrote: > On Thu, Aug 03,

Re: [PATCH 0/3] remove rw_page() from brd, pmem and btt

2017-08-03 Thread Jens Axboe
On 08/03/2017 03:13 PM, Ross Zwisler wrote: > On Thu, Aug 03, 2017 at 09:13:15AM +0900, Minchan Kim wrote: >> Hi Ross, >> >> On Wed, Aug 02, 2017 at 04:13:59PM -0600, Ross Zwisler wrote: >>> On Fri, Jul 28, 2017 at 10:31:43AM -0700, Matthew Wilcox wrote: On Fri, Jul 28, 2017 at 10:56:01AM

Re: [PATCH v2 5/5] libnvdimm: add DMA support for pmem blk-mq

2017-08-03 Thread Ross Zwisler
On Wed, Aug 02, 2017 at 11:41:25AM -0700, Dave Jiang wrote: > Adding DMA support for pmem blk reads. This provides significant CPU > reduction with large memory reads with good performance. DMAs are triggered > with test against bio_multiple_segment(), so the small I/Os (4k or less?) > are still

Re: [PATCH 0/3] remove rw_page() from brd, pmem and btt

2017-08-03 Thread Ross Zwisler
On Thu, Aug 03, 2017 at 09:13:15AM +0900, Minchan Kim wrote: > Hi Ross, > > On Wed, Aug 02, 2017 at 04:13:59PM -0600, Ross Zwisler wrote: > > On Fri, Jul 28, 2017 at 10:31:43AM -0700, Matthew Wilcox wrote: > > > On Fri, Jul 28, 2017 at 10:56:01AM -0600, Ross Zwisler wrote: > > > > Dan Williams

[PATCH] ndctl, util: in util_namespace_to_json, initialize bb_count to 0

2017-08-03 Thread Vishal Verma
In the util_namespace_to_json -> dev_badblocks_to_json call chain, if anything errors out before setting bb_count, bb_count will have an uninitialized, random value. This happens during namespace creation, as the namespaceX.Y/resource is not available (as the namespace is disabled), and the
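
The bug pattern described here is an out-parameter left unwritten on an early error path. A minimal sketch of the fix follows; `count_badblocks` and `report_badblocks` are hypothetical stand-ins for illustration, not the actual ndctl functions.

```c
#include <stddef.h>

/* Hypothetical helper that may fail before it ever writes *count. */
static int count_badblocks(const char *resource, unsigned int *count)
{
    if (resource == NULL)
        return -1;   /* early error: *count is left untouched */
    *count = 3;      /* pretend three bad blocks were found */
    return 0;
}

/* Hypothetical caller that reports the count regardless of errors. */
static unsigned int report_badblocks(const char *resource)
{
    unsigned int bb_count = 0;  /* initialized, so the error path is defined */

    (void)count_badblocks(resource, &bb_count);
    return bb_count;
}
```

Without the `= 0`, the error path would return whatever happened to be on the stack.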

Re: [PATCH v2 4/5] libnvdimm: Adding blk-mq support to the pmem driver

2017-08-03 Thread Ross Zwisler
On Wed, Aug 02, 2017 at 11:41:19AM -0700, Dave Jiang wrote: > Adding blk-mq support to the pmem driver in addition to the direct bio > support. This allows for hardware offloading via DMA engines. By default > the bio method will be enabled. The blk-mq support can be turned on via > module

[PATCH v2 3/5] fs, xfs: introduce FALLOC_FL_UNSEAL_BLOCK_MAP

2017-08-03 Thread Dan Williams
Provide an explicit fallocate operation type for clearing the S_IOMAP_IMMUTABLE flag. Like the enable case it requires CAP_IMMUTABLE and it can only be performed while no process has the file mapped. Cc: Jan Kara Cc: Jeff Moyer Cc: Christoph Hellwig

[PATCH v2 5/5] xfs: toggle XFS_DIFLAG2_IOMAP_IMMUTABLE in response to fallocate

2017-08-03 Thread Dan Williams
After validating the state of the file as not having holes, shared extents, or active mappings try to commit the XFS_DIFLAG2_IOMAP_IMMUTABLE flag to the on-disk inode metadata. If that succeeds then allow the S_IOMAP_IMMUTABLE to be set on the vfs inode. Cc: Jan Kara Cc: Jeff Moyer

Re: [RFC 01/16] NOVA: Documentation

2017-08-03 Thread Randy Dunlap
On 08/03/2017 12:48 AM, Steven Swanson wrote: > A brief overview is in README.md. > See below. > Implementation and usage details are in Documentation/filesystems/nova.txt. > Reviewed in a separate email. > These two papers provide a detailed, high-level description of NOVA's design > goals

[PATCH v2 0/5] fs, xfs: block map immutable files for dax, dma-to-storage, and swap

2017-08-03 Thread Dan Williams
Changes since v1 [1]: * Add IS_IOMAP_IMMUTABLE() checks to xfs ioctl paths that perform block map changes (xfs_alloc_file_space and xfs_free_file_space) (Darrick) * Rather than complete a partial write, fail all writes that would attempt to extend the file size (Darrick) * Introduce

[PATCH v2 2/5] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-03 Thread Dan Williams
From falloc.h: FALLOC_FL_SEAL_BLOCK_MAP is used to seal (make immutable) all of the file logical-to-physical extent offset mappings in the file. The purpose is to allow an application to assume that there are no holes or shared extents in the file and that the metadata needed to

[PATCH v2 1/5] fs, xfs: introduce S_IOMAP_IMMUTABLE

2017-08-03 Thread Dan Williams
An inode with this flag set indicates that the file's block map cannot be changed from the currently allocated set. The implementation of toggling the flag and sealing the state of the extent map is saved for a later patch. The functionality provided by S_IOMAP_IMMUTABLE, once toggle support is

[PATCH v2 4/5] xfs: introduce XFS_DIFLAG2_IOMAP_IMMUTABLE

2017-08-03 Thread Dan Williams
Add an on-disk inode flag to record the state of the S_IOMAP_IMMUTABLE in-memory vfs inode flags. This allows the protections against reflink and hole punch to be automatically restored on a subsequent boot when the in-memory inode is established. The FS_XFLAG_IOMAP_IMMUTABLE is introduced to

Re: [PATCH v2 0/5] fs, xfs: block map immutable files for dax, dma-to-storage, and swap

2017-08-03 Thread Dan Williams
[ adding linux-api to the cover letter for notification, will send the full set to linux-api for v3 ] On Thu, Aug 3, 2017 at 7:28 PM, Dan Williams wrote: > Changes since v1 [1]: > * Add IS_IOMAP_IMMUTABLE() checks to xfs ioctl paths that perform block > map changes

[PATCH v3 2/8] dmaengine: change transaction type DMA_SG to DMA_SG_SG

2017-08-03 Thread Dave Jiang
In preparation for adding an API to perform SG to/from buffer for dmaengine, we will change DMA_SG to DMA_SG_SG in order to make explicit what this op type is for. Signed-off-by: Dave Jiang --- Documentation/dmaengine/provider.txt |2 +-

[PATCH v3 5/8] dmaengine: ioatdma: dma_prep_memcpy_sg support

2017-08-03 Thread Dave Jiang
Adding ioatdma support to copy from a physically contiguous buffer to a provided scatterlist and vice versa. This is used to support reading/writing persistent memory in the pmem driver. Signed-off-by: Dave Jiang --- drivers/dma/ioat/dma.h |4 +++

[PATCH v3 1/8] dmaengine: ioatdma: revert 7618d035 to allow sharing of DMA channels

2017-08-03 Thread Dave Jiang
Commit 7618d0359c16 ("dmaengine: ioatdma: Set non RAID channels to be private capable") marks all non-RAID ioatdma channels as private so that they can be requested by dma_request_channel(). With PQ CAP support going away for ioatdma, this would make all channels private. To support the usage of ioatdma for

[4.9-stable PATCH] device-dax: fix sysfs duplicate warnings

2017-08-03 Thread Dan Williams
commit bbb3be170ac2891526ad07b18af7db226879a8e7 upstream. Fix warnings of the form... WARNING: CPU: 10 PID: 4983 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x62/0x80 sysfs: cannot create duplicate filename '/class/dax/dax12.0' Call Trace: dump_stack+0x63/0x86 __warn+0xcb/0xf0

Re: [PATCH] ndctl, util: in util_namespace_to_json, initialize bb_count to 0

2017-08-03 Thread Dan Williams
On Thu, Aug 3, 2017 at 2:18 PM, Vishal Verma wrote: > If in the util_namespace_to_json -> dev_badblocks_to_json call chain, if > anything errors out before setting bb_count, bb_count will have an > uninitialized, random value. > > This happens during namespace creation,

Re: [PATCH 0/3] remove rw_page() from brd, pmem and btt

2017-08-03 Thread Minchan Kim
On Thu, Aug 03, 2017 at 10:05:44AM +0200, Christoph Hellwig wrote: > FYI, for the read side we should use the on-stack bio unconditionally, > as it will always be a win (or not show up at all). Think about readahead. Unconditional on-stack bio to read around pages with faulted address will cause

[PATCH v3 8/8] libnvdimm: add DMA support for pmem blk-mq

2017-08-03 Thread Dave Jiang
Adding DMA support for pmem blk reads. This provides significant CPU reduction with large memory reads with good performance. DMAs are triggered with test against bio_multiple_segment(), so the small I/Os (4k or less?) are still performed by the CPU in order to reduce latency. By default the pmem

[PATCH v3 6/8] dmaengine: add SG support to dmaengine_unmap

2017-08-03 Thread Dave Jiang
This should provide support to unmap scatterlist with the dmaengine_unmap_data. We will support only 1 scatterlist per direction. The DMA addresses array has been overloaded for the 2 or less entries DMA unmap data structure in order to store the SG pointer(s). Signed-off-by: Dave Jiang

[PATCH v3 0/8] Adding blk-mq and DMA support to pmem block driver

2017-08-03 Thread Dave Jiang
v3: - Added patch to rename DMA_SG to DMA_SG_SG to make it explicit - Added DMA_MEMCPY_SG transaction type to dmaengine - Misc patch to add verification of DMA_MEMSET_SG that was missing - Addressed all nd_pmem driver comments from Ross. v2: - Make dma_prep_memcpy_* into one function per Dan. -

[PATCH v3 7/8] libnvdimm: Adding blk-mq support to the pmem driver

2017-08-03 Thread Dave Jiang
Adding blk-mq support to the pmem driver in addition to the direct bio support. This allows for hardware offloading via DMA engines. By default the bio method will be enabled. The blk-mq support can be turned on via module parameter queue_mode=1. Signed-off-by: Dave Jiang

[PATCH v3 3/8] dmaengine: Add DMA_MEMCPY_SG transaction op

2017-08-03 Thread Dave Jiang
Adding a dmaengine transaction operation that allows copy to/from a scatterlist and a flat buffer. Signed-off-by: Dave Jiang --- Documentation/dmaengine/provider.txt |3 +++ drivers/dma/dmaengine.c |2 ++ include/linux/dmaengine.h|6

[PATCH v3 4/8] dmaengine: add verification of DMA_MEMSET_SG in dmaengine

2017-08-03 Thread Dave Jiang
DMA_MEMSET_SG is missing the verification of having the operation set and also a supporting function provided. Fixes: Commit 50c7cd2bd ("dmaengine: Add scatter-gathered memset") Signed-off-by: Dave Jiang --- drivers/dma/dmaengine.c |2 ++ 1 file changed, 2