Clean umount/mount
--
On a clean unmount, Nova saves the contents of many of its DRAM data structures
to PMEM to accelerate the next mount:
1. Nova stores the allocator state for each of the per-cpu allocators to the
log of a reserved inode (NOVA_BLOCK_NODE_INO).
2. Nova
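Step 1 can be modeled in userspace as a walk over the per-CPU free lists that
flattens each (start, count) range into one saved array; a minimal sketch,
where `free_range`, `save_allocator_state`, and the flat array are
illustrative stand-ins for NOVA's free-list nodes and the reserved inode's
log, not its actual identifiers:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for NOVA's per-CPU free lists: each CPU owns a
 * list of (start, count) block ranges; on clean unmount every range is
 * appended to one flat "log" so the next mount can rebuild the allocators
 * without scanning the device. */
struct free_range {
    unsigned long start;
    unsigned long count;
    struct free_range *next;
};

/* Append every range of every CPU's free list into saved[]; returns the
 * number of ranges written, or -1 if saved[] is too small. */
static int save_allocator_state(struct free_range **percpu, int ncpus,
                                struct free_range *saved, size_t cap)
{
    size_t n = 0;

    for (int cpu = 0; cpu < ncpus; cpu++) {
        for (struct free_range *r = percpu[cpu]; r; r = r->next) {
            if (n == cap)
                return -1;
            saved[n].start = r->start;
            saved[n].count = r->count;
            saved[n].next = NULL;
            n++;
        }
    }
    return (int)n;
}
```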
Signed-off-by: Steven Swanson
---
fs/nova/perf.c | 594
fs/nova/perf.h | 96
fs/nova/stats.c | 685 +++
fs/nova/stats.h | 218 ++
4 files
Nova provides the normal ioctls for setting file attributes and provides a
/proc-based interface for taking snapshots.
Signed-off-by: Steven Swanson
---
fs/nova/ioctl.c | 185 +++
fs/nova/sysfs.c | 543
Nova protects data and metadata from corruption due to media errors and
scribbles -- software errors in the kernel that may overwrite Nova data.
Replication
---
Nova replicates all PMEM metadata structures (there are a few exceptions; they
are works in progress). For each structure, there is a primary
To access file data via read(), Nova maintains a radix tree in DRAM for each
inode (nova_inode_info_header.tree) that maps file offsets to write log
entries. For directories, the same tree maps a hash of filenames to their
corresponding dentry.
In both cases, Nova populates the tree when the
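A minimal sketch of the lookup this tree serves on the read() path; a flat
array stands in for the kernel radix tree, and `write_entry`, `file_tree`,
and the constants are illustrative, not NOVA's actual definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the per-inode DRAM tree
 * (nova_inode_info_header.tree): file offsets, in 4KB block units, map to
 * the write-log entry that last covered that block.  The kernel uses a
 * radix tree; a flat array is enough to show the lookup path. */
#define SKETCH_BLOCK_SHIFT 12
#define SKETCH_MAX_BLOCKS 64

struct write_entry {                 /* illustrative, not NOVA's layout */
    unsigned long pgoff;             /* first block this entry covers */
    unsigned long num_pages;
};

struct file_tree {
    struct write_entry *slot[SKETCH_MAX_BLOCKS];
};

/* Record that entry `we` now covers its range of blocks. */
static void tree_insert(struct file_tree *t, struct write_entry *we)
{
    for (unsigned long i = 0; i < we->num_pages; i++)
        t->slot[we->pgoff + i] = we;
}

/* read() path: byte offset -> the log entry holding that data, or NULL. */
static struct write_entry *tree_lookup(struct file_tree *t, size_t off)
{
    return t->slot[off >> SKETCH_BLOCK_SHIFT];
}
```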
Nova recovers log space with a two-phase garbage collection system. When a log
reaches the end of its allocated pages, Nova allocates more space. Then, the
fast GC algorithm scans the log to remove pages that have no valid entries.
Then, it estimates how many pages the log's valid entries would
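The fast GC pass can be sketched as a list walk that unlinks pages with no
valid entries; `log_page` and `fast_gc` are illustrative names, not NOVA's:

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified userspace model of the fast GC pass: log pages form a singly
 * linked list; a page with no valid entries is unlinked and freed. */
struct log_page {
    int valid_entries;           /* count of live entries in this page */
    struct log_page *next;
};

/* Returns the new head after dropping dead pages; *freed counts them. */
static struct log_page *fast_gc(struct log_page *head, int *freed)
{
    struct log_page **link = &head;

    while (*link) {
        struct log_page *page = *link;

        if (page->valid_entries == 0) {
            *link = page->next;  /* unlink the dead page */
            free(page);
            (*freed)++;
        } else {
            link = &page->next;
        }
    }
    return head;
}
```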
NOVA leverages the kernel's DAX mechanisms for mmap and file data access. Nova
maintains a red-black tree in DRAM (nova_inode_info_header.vma_tree) to track
which portions of a file have been mapped.
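The question the vma_tree answers can be sketched with a plain range-overlap
check (the kernel uses a red-black tree for efficiency; the names here are
illustrative):

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model of the per-inode mapping tracker
 * (nova_inode_info_header.vma_tree).  A short array of [start, end) page
 * ranges shows the query it answers: "is this portion of the file
 * currently mmap()ed?" */
struct vma_range {
    unsigned long pgoff_start;   /* inclusive, in pages */
    unsigned long pgoff_end;     /* exclusive */
};

static bool range_is_mapped(const struct vma_range *v, int n,
                            unsigned long start, unsigned long end)
{
    for (int i = 0; i < n; i++)
        if (start < v[i].pgoff_end && end > v[i].pgoff_start)
            return true;         /* overlaps an existing mapping */
    return false;
}
```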
Signed-off-by: Steven Swanson
---
fs/nova/dax.c | 1346
Nova supports snapshots to facilitate backups.
Taking a snapshot
-
Each Nova file system has a current epoch_id in its super block, and each log
entry has the epoch_id attached to it at creation. When the user creates a
snapshot, Nova increments the epoch_id for the file system
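The epoch mechanism can be sketched in a few lines; struct, field, and
function names here are illustrative, not NOVA's:

```c
#include <assert.h>

/* Sketch of snapshot epochs: the super block carries a current epoch_id,
 * every log entry is stamped with it at creation, and taking a snapshot
 * just increments the counter.  An entry belongs to a snapshot iff it was
 * created in that snapshot's epoch or earlier. */
struct sketch_sb { unsigned long long epoch_id; };
struct sketch_entry { unsigned long long epoch_id; };

static void entry_create(struct sketch_sb *sb, struct sketch_entry *e)
{
    e->epoch_id = sb->epoch_id;          /* stamp at creation */
}

static unsigned long long take_snapshot(struct sketch_sb *sb)
{
    return sb->epoch_id++;               /* snapshot captures the old epoch */
}

static int entry_in_snapshot(const struct sketch_entry *e,
                             unsigned long long snap_epoch)
{
    return e->epoch_id <= snap_epoch;
}
```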
A brief overview is in README.md.
Implementation and usage details are in Documentation/filesystems/nova.txt.
These two papers provide a detailed, high-level description of NOVA's design
goals and approach:
NOVA: A Log-structured File System for Hybrid Volatile/Non-volatile Main
Memories
Add (and implement) a module command line option to nd_pmem to support
read-only pmem devices.
Signed-off-by: Steven Swanson
---
arch/x86/include/asm/io.h |1 +
arch/x86/mm/ioremap.c | 25 ++---
drivers/nvdimm/pmem.c | 14 --
On Tue, Aug 01, 2017 at 10:43:30AM -0700, Dan Williams wrote:
> On Tue, Aug 1, 2017 at 12:34 AM, Johannes Thumshirn
> wrote:
> > Dave Jiang writes:
> >
> >> Adding DMA support for pmem blk reads. This provides signficant CPU
> >> reduction with large
FYI, for the read side we should use the on-stack bio unconditionally,
as it will always be a win (or not show up at all).
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
FS Layout
==
A Nova file system resides in a single PMEM device. Nova divides the device
into 4KB blocks that are arranged like so:
block
+----+------------------------------------------------+
|  0 | primary super block (struct nova_super_block)  |
Nova maintains per-CPU inode tables, and inode numbers are striped across the
tables (i.e., inos 0, n, 2n,... on cpu 0; inos 1, n + 1, 2n + 1, ... on cpu 1).
The inodes themselves live in a set of linked lists (one per CPU) of 2MB
blocks. The last 8 bytes of each block point to the next block.
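The striping rule reduces to simple modular arithmetic; a sketch (the helper
names are illustrative):

```c
#include <assert.h>

/* The striping rule above, modeled directly: with n per-CPU inode tables,
 * inode number ino lives in table ino % n (cpu 0 gets 0, n, 2n, ...;
 * cpu 1 gets 1, n + 1, 2n + 1, ...), at slot ino / n within that table. */
static unsigned int ino_to_table(unsigned long ino, unsigned int ncpus)
{
    return ino % ncpus;
}

static unsigned long ino_to_slot(unsigned long ino, unsigned int ncpus)
{
    return ino / ncpus;
}
```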
Nova maintains a log for each inode that records updates to the inode's
metadata and holds pointers to the file data. Nova makes updates to file data
and metadata atomic by atomically appending log entries to the log.
Each inode contains pointers to head and tail of the inode's log. When the
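A minimal userspace model of atomic-by-append: the entry is written first,
and the single-word advance of the tail is the commit point. In the kernel
the store would be followed by a cache-line flush and fence; names here are
illustrative:

```c
#include <assert.h>
#include <string.h>

/* Model of atomic-by-append: an update becomes visible in one step when
 * the log's tail is advanced past the fully written entry.  A plain array
 * and index stand in for the PMEM log and its tail pointer. */
struct sketch_log {
    unsigned char data[4096];
    size_t tail;                 /* bytes [0, tail) are committed */
};

static int log_append(struct sketch_log *log, const void *entry, size_t len)
{
    if (log->tail + len > sizeof(log->data))
        return -1;               /* caller must extend the log first */
    memcpy(log->data + log->tail, entry, len);
    /* ...persist the entry, then a fence, would go here in the kernel... */
    log->tail += len;            /* single-word commit point */
    return 0;
}
```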
Nova uses a lightweight journaling mechanism to provide atomicity for
operations that modify more than one inode. The journals provide logging
for two types of operations:
1. Single word updates (JOURNAL_ENTRY)
2. Copying inodes (JOURNAL_INODE)
The journals are undo logs: Nova creates the
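A minimal model of the undo discipline for single-word updates
(JOURNAL_ENTRY); structure and function names are illustrative:

```c
#include <assert.h>

/* Undo journaling: before modifying a word, record its address and old
 * value; recovery after a crash rolls the journal back to restore the old
 * values, making a multi-word operation all-or-nothing. */
struct undo_rec {
    unsigned long *addr;
    unsigned long old;
};

struct undo_journal {
    struct undo_rec rec[16];
    int n;
};

static void journaled_store(struct undo_journal *j,
                            unsigned long *addr, unsigned long val)
{
    j->rec[j->n].addr = addr;    /* log the undo information first */
    j->rec[j->n].old = *addr;
    j->n++;
    *addr = val;                 /* then perform the update */
}

static void journal_recover(struct undo_journal *j)
{
    while (j->n > 0) {           /* crash mid-operation: roll back */
        j->n--;
        *j->rec[j->n].addr = j->rec[j->n].old;
    }
}
```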
Nova uses per-CPU allocators to manage free PMEM blocks. On initialization,
NOVA divides the range of blocks in the PMEM device among the CPUs, and each
set of blocks is managed solely by its CPU. We call these ranges allocation
regions.
Some of the blocks in an allocation region have fixed
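The division into allocation regions can be sketched as follows; the helper
name and the choice to give remainder blocks to the last CPU are
illustrative assumptions, not NOVA's documented policy:

```c
#include <assert.h>

/* Split the device's block range into one contiguous allocation region
 * per CPU.  Remainder blocks go to the last region so every block is
 * owned by exactly one CPU. */
static void allocation_region(unsigned long total_blocks, int ncpus, int cpu,
                              unsigned long *start, unsigned long *count)
{
    unsigned long per_cpu = total_blocks / ncpus;

    *start = (unsigned long)cpu * per_cpu;
    *count = (cpu == ncpus - 1) ? total_blocks - *start : per_cpu;
}
```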
This is an RFC patch series that implements NOVA (NOn-Volatile memory
Accelerated file system), a new file system built for PMEM.
NOVA's goal is to provide a high-performance, full-featured, production-ready
file system tailored for byte-addressable non-volatile memories (e.g., NVDIMMs
and Intel's
Hi Ross,
[auto build test WARNING on linux-nvdimm/libnvdimm-for-next]
[also build test WARNING on v4.13-rc3]
[cannot apply to next-20170803]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system]
url:
https://github.com/0day-ci/linux/commits/Ross
In the util_namespace_to_json -> dev_badblocks_to_json call chain, if
anything errors out before setting bb_count, bb_count will have an
uninitialized, random value.
This happens during namespace creation, as the namespaceX.Y/resource is
not available (as the namespace is disabled), and the
Provide an explicit fallocate operation type for clearing the
S_IOMAP_IMMUTABLE flag. Like the enable case it requires CAP_IMMUTABLE
and it can only be performed while no process has the file mapped.
Cc: Jan Kara
Cc: Jeff Moyer
Cc: Christoph Hellwig
After validating the state of the file as not having holes, shared
extents, or active mappings try to commit the
XFS_DIFLAG2_IOMAP_IMMUTABLE flag to the on-disk inode metadata. If that
succeeds then allow the S_IOMAP_IMMUTABLE to be set on the vfs inode.
Cc: Jan Kara
Cc: Jeff Moyer
On 08/03/2017 12:48 AM, Steven Swanson wrote:
> A brief overview is in README.md.
>
See below.
> Implementation and usage details are in Documentation/filesystems/nova.txt.
>
Reviewed in a separate email.
> These two papers provide a detailed, high-level description of NOVA's design
> goals
Changes since v1 [1]:
* Add IS_IOMAP_IMMUTABLE() checks to xfs ioctl paths that perform block
map changes (xfs_alloc_file_space and xfs_free_file_space) (Darrick)
* Rather than complete a partial write, fail all writes that would
attempt to extend the file size (Darrick)
* Introduce
From falloc.h:
FALLOC_FL_SEAL_BLOCK_MAP is used to seal (make immutable) all of the
file logical-to-physical extent offset mappings in the file. The
purpose is to allow an application to assume that there are no holes
or shared extents in the file and that the metadata needed to
An inode with this flag set indicates that the file's block map cannot
be changed from the currently allocated set.
The implementation of toggling the flag and sealing the state of the
extent map is saved for a later patch. The functionality provided by
S_IOMAP_IMMUTABLE, once toggle support is
Add an on-disk inode flag to record the state of the S_IOMAP_IMMUTABLE
in-memory vfs inode flags. This allows the protections against reflink
and hole punch to be automatically restored on a subsequent boot when
the in-memory inode is established.
The FS_XFLAG_IOMAP_IMMUTABLE is introduced to
[ adding linux-api to the cover letter for notification, will send the
full set to linux-api for v3 ]
On Thu, Aug 3, 2017 at 7:28 PM, Dan Williams wrote:
> Changes since v1 [1]:
> * Add IS_IOMAP_IMMUTABLE() checks to xfs ioctl paths that perform block
> map changes
In preparation for adding an API to perform SG to/from buffer for dmaengine,
we will rename DMA_SG to DMA_SG_SG to make explicit what this op type is
for.
Signed-off-by: Dave Jiang
---
Documentation/dmaengine/provider.txt |2 +-
Adding ioatdma support to copy from a physically contiguous buffer to a
provided scatterlist and vice versa. This is used to support
reading/writing persistent memory in the pmem driver.
Signed-off-by: Dave Jiang
---
drivers/dma/ioat/dma.h |4 +++
Commit 7618d0359c16 ("dmaengine: ioatdma: Set non RAID channels to be
private capable") makes all non-RAID ioatdma channels as private to be
requestable by dma_request_channel(). With PQ CAP support going away for
ioatdma, this would make all channels private. To support the usage of
ioatdma for
commit bbb3be170ac2891526ad07b18af7db226879a8e7 upstream.
Fix warnings of the form...
WARNING: CPU: 10 PID: 4983 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x62/0x80
sysfs: cannot create duplicate filename '/class/dax/dax12.0'
Call Trace:
dump_stack+0x63/0x86
__warn+0xcb/0xf0
On Thu, Aug 03, 2017 at 10:05:44AM +0200, Christoph Hellwig wrote:
> FYI, for the read side we should use the on-stack bio unconditionally,
> as it will always be a win (or not show up at all).
Think about readahead. Unconditional on-stack bio to read around pages
with faulted address will cause
Adding DMA support for pmem blk reads. This provides significant CPU
reduction with large memory reads with good performance. DMAs are triggered
by a test against bio_multiple_segment(), so the small I/Os (4k or less?)
are still performed by the CPU in order to reduce latency. By default
the pmem
This provides support for unmapping a scatterlist with
dmaengine_unmap_data. We will support only one scatterlist per direction.
The DMA addresses array has been overloaded in the two-or-fewer-entries DMA
unmap data structure in order to store the SG pointer(s).
Signed-off-by: Dave Jiang
v3:
- Added patch to rename DMA_SG to DMA_SG_SG to make it explicit
- Added DMA_MEMCPY_SG transaction type to dmaengine
- Misc patch to add verification of DMA_MEMSET_SG that was missing
- Addressed all nd_pmem driver comments from Ross.
v2:
- Make dma_prep_memcpy_* into one function per Dan.
-
Adding blk-mq support to the pmem driver in addition to the direct bio
support. This allows for hardware offloading via DMA engines. By default
the bio method will be enabled. The blk-mq support can be turned on via
module parameter queue_mode=1.
Signed-off-by: Dave Jiang
Adding a dmaengine transaction operation that allows copy to/from a
scatterlist and a flat buffer.
Signed-off-by: Dave Jiang
---
Documentation/dmaengine/provider.txt |3 +++
drivers/dma/dmaengine.c |2 ++
include/linux/dmaengine.h|6
DMA_MEMSET_SG is missing the verification that the operation is set and
that a supporting function is provided.
Fixes: 50c7cd2bd ("dmaengine: Add scatter-gathered memset")
Signed-off-by: Dave Jiang
---
drivers/dma/dmaengine.c |2 ++
1 file changed, 2