Pekka J Enberg wrote:
Sorry if this is an obvious question but what prevents another thread
from doing mmap() before we do the second walk and messing up num_gh?
Nothing, I suspect. OCFS2 has a problem like this, too. It wants a way
for a file system to serialize mmap/munmap/mremap during
Pekka Enberg wrote:
In addition, the vma walk will become an unmaintainable mess as soon as someone introduces another mmap()-capable fs that needs similar locking.
Yup, I suspect that if the core kernel ends up caring about this problem
then the VFS will be involved in helping file systems
I'm not sure what the best way to fix this is. One option is to always make a copy of the iovec and pass that down. Any other thoughts?
Can we use this as another motivation to introduce an iovec container struct instead of passing a raw iov/seg? The transition could turn hand-rolled
(my monkey test code is on http://kernel-perf.sourceforge.net/diotest).
Nice.
Do you have any interest in working with the autotest (http://test.kernel.org/autotest) guys to get your tests into their rotation?
- z
-
To unsubscribe from this list: send the line unsubscribe linux-kernel
On Nov 29, 2006, at 2:32 AM, Sébastien Dugué wrote:
compat_sys_io_submit() cleanup
Clean up compat_sys_io_submit() by duplicating some of the native syscall logic in the compat layer and directly calling io_submit_one() instead of fooling the syscall into thinking it is
sys_io_getevents() reads:
uh! You must be meaning sys_io_submit()?
Heh, yes, of course. Damn these fingers!
- z
At that time, a patch was written for the raw device driver to demonstrate that large performance headroom is achievable (a ~20% speedup for a micro-benchmark and ~2% for a db transaction processing benchmark) with a tight I/O submission processing loop.
Where exactly does the benefit come from? icache
On Nov 30, 2006, at 10:16 PM, Chen, Kenneth W wrote:
Zach Brown wrote on Thursday, November 30, 2006 1:45 PM
At that time, a patch was written for raw device to demonstrate that large performance headroom is achievable (at ~20% speedup for micro-benchmark and ~2% for db transaction
On Dec 4, 2006, at 8:26 AM, Chen, Kenneth W wrote:
The access_ok() and negative length check on each iov segment in function generic_file_aio_read/write are redundant. They are all already checked before calling down to these low level generic functions.
...
So it's not possible to
Maybe we should create another internal generic_file_aio_read/write for in-core use? fs/read_write.c and fs/aio.c are not modular, and the check is already there. External modules can do the check and then call down to the internal one.
Maybe. I'd rather see fewer moving
[EMAIL PROTECTED]
That seems to be the case, indeed.
Acked-by: Zach Brown [EMAIL PROTECTED]
- z
On Fri, Jul 27, 2012 at 01:22:10AM +0530, Ankit Jain wrote:
I should probably be doing better tests, any suggestions on what or
how I can test?
Well, is the test actually *doing* anything with these IOs?
Calling io_submit() and then immediately waiting for completion is the
best case for
The idea is simple: leave the decision to the file system user, enabling mount-wide O_DIRECT support with a new mount option, for example,
I believe a better approach to your problem is actually to enable
loopback device driver to use direct IO. Someone was actually
[ ugh, still jet lagged. ]
Hi Nick,
When Matthew was describing this work at an LCA presentation (not sure whether you were at that presentation or not), Zach came up with the idea of allowing the submitting application to control the CPU on which the io completion processing was occurring
Do you have any userspace code that can be used to get started experimenting with your fibril based AIO stuff?
I only have a goofy little test app so far:
http://www.zabbo.net/~zab/aio-walk-tree.c
It's not to be taken too seriously :)
I want to try it on from a userspace
let me clarify this: i very much like your AIO patchset in general, in
the sense that it 'completes' the AIO implementation: finally everything can be done via it, greatly increasing its utility and hopefully its
penetration. This is the most important step, by far.
We violently agree on
Wooo ...hold on ... I think this is swinging out of perspective :)
I'm sorry, but I don't. I think using the EIOCBRETRY method in
complicated code paths requires too much maintenance cost to justify
its benefits. We can agree to disagree on that judgement :).
- z
That sounds like a programming error, don't you think? Maybe
returning EINVAL is the right approach?
Maybe. I think I'd prefer to be permissive and queue as much as
possible, but it's not a strong preference. Returning EINVAL seems
ok, too.
- z
Priorities cannot be shared, as they have to adapt to the per-request priority when we get down to the nitty gritty of POSIX AIO, as otherwise realtime issues like keepalive transmits will be handled incorrectly.
Well, maybe not *blind* sharing. But something more than the disconnect
Other questions really relate to the scheduling. Zach, do you intend schedule_fibrils() to be a call that code would make directly, or something invoked just from schedule()?
I'd much rather keep the current sleeping API in as much as is
possible. So, yeah, if we can get schedule() to notice and behave
accordingly I'd
ok, i think i noticed another misunderstanding. The kernel-thread-based scheme i'm suggesting would /not/ 'switch' to another kernel thread in the cached case, by default. It would just execute in the original context (as if it were a synchronous syscall), and the switch to a kernel thread from
Since I still think that the many thousands of potential async operations coming from network sockets are better handled with a classical event mechanism [1], and since smooth integration of the new async syscalls into the standard POSIX infrastructure is IMO a huge win, I think we need to have a
But really, being a scheduler guy i was much more concerned about the duplication and problems caused by the fibril concept itself, which duplication and complexity make up 80% of Zach's submitted patchset.
For example this bit:
[PATCH 3 of 4] Teach paths to wake a specific void * target
+	current->per_call = next->per_call;
Pointer instead of structure copy?
Sure, there are lots of trade-offs there, but the story changes if we
keep the 1:1 relationship between task_struct and thread_info.
- z
Or we need some sort of enter_context()/leave_context() (adopt mm, files, ...) to have a per-CPU kthread be able to execute the syscall from the async() caller context.
I believe that's what Ingo is hoping for, yes.
- z
The result of one async operation is basically a cookie and a result
code. Eight or sixteen bytes at most.
s/basically/minimally/
Well, yeah. The patches I sent had:
struct asys_completion {
	long		return_code;
	unsigned long	cookie;
};
That's as stupid as it gets.
No, that's *really* it ;)
For syscalls, sure.
The kevent work incorporates Uli's desire to have more data per
event. Have you read his OLS stuff? It's been a while since I did
so I've lost the details of why he cares to have more.
Let me say it again, maybe a little louder this time:
- we'd need to do it in the kernel (which is actually nasty, since different system calls have slightly different semantics - some don't return any error value at all, and negative numbers are real numbers)
- we'd have to teach user space about the negative errno mechanism, in
It has me excited in any case. Once anything even remotely testable appears (Zach tells me not to try the current code), I'll work it into MTasker (http://ds9a.nl/mtasker) and make it power a nameserver that does async i/o, for use with very very large zones that aren't preloaded.
I'll be
That's not how the patches work right now, but yes, I at least personally think that it's something we should aim for (ie the interface shouldn't _require_ us to always wait for things even if perhaps an early implementation might make everything be delayed at first)
I agree that we
On Feb 9, 2007, at 6:05 AM, Suparna Bhattacharya wrote:
On Fri, Feb 09, 2007 at 11:40:27AM +0100, Jiri Kosina wrote:
On Fri, 9 Feb 2007, Andrew Morton wrote:
@@ -1204,7 +1204,7 @@ generic_file_aio_read(struct kiocb *iocb, const struct iovec *iov,
What I have there is not actually a full-blown file io descriptor, because there is no file or offset. It is just an iovec iterator (so maybe I should rename it to iov_iter, rather than iodesc).
I think it might be a nice idea to keep this iov_iter as a standalone
structure, and it could be
So, reiserfs and NFS are nesting i_mutex inside the mmap_sem.
[<b038c6e5>] mutex_lock+0x1c/0x1f
[<b01b17e9>] reiserfs_file_release+0x54/0x447
[<b016afe7>] __fput+0x53/0x101
[<b016b0ee>] fput+0x19/0x1c
[<b015bcd5>] remove_vma+0x3b/0x4d
[<b015c659>]
So reiser and NFS need to be fixed. No?
Actually, it is rather mmap() that needs to be fixed.
Sure, I'm willing to have that demonstrated. My point was that DIO
getting the mmap_sem inside i_mutex is currently correct.
reiserfs, though, seems to be out on a more precarious limb ;).
- z
won't pack. There are already a host of
conditions under which it won't pack.
Totally untested, but built.
Signed-off-by: Zach Brown [EMAIL PROTECTED]
diff --git a/fs/reiserfs/file.c b/fs/reiserfs/file.c
index a804903..40085f1 100644
--- a/fs/reiserfs/file.c
+++ b/fs/reiserfs/file.c
@@ -46,7
Ugh, I thought the preallocation was getting freed elsewhere, but it
looks like I was wrong. We can't just skip the i_mutex after all,
sorry.
Ah, so none of those tests at the top will stop tail packing if there's
been pre-allocation?
Like, uh, the inode reference count test?
- z
cc:ing stable because the initial commit did as well.
Signed-off-by: Zach Brown z...@redhat.com
CC: sta...@kernel.org [2.6.37+]
---
 fs/fuse/file.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index b321a68..514f12a 100644
On Tue, Jul 24, 2012 at 05:11:05PM +0530, Ankit Jain wrote:
Currently, io_submit tries to execute the io requests on the same thread, which could block because of various reasons (eg. allocation of disk blocks). So, essentially, io_submit ends up being a blocking call.
Yup, sadly that's how
And most importantly block devices, as they are one of the biggest
use cases of AIO. With an almost no-op get_blocks callback I can't
see how this change would provide any gain there.
Historically we'd often see submission stuck waiting for requests.
Tasks often try to submit way more aio
On Wed, Nov 28, 2012 at 08:43:24AM -0800, Kent Overstreet wrote:
Bunch of performance improvements and cleanups Zach Brown and I have
been working on. The code should be pretty solid at this point, though
it could of course use more review and testing.
Thanks for sending these out. I have
On Wed, Nov 28, 2012 at 08:43:31AM -0800, Kent Overstreet wrote:
Minor refactoring, to get rid of some duplicated code
A minor nit:
	spin_lock_irq(&ctx->ctx_lock);
-	ret = -EAGAIN;
+
	kiocb = lookup_kiocb(ctx, iocb, key);
-	if (kiocb && kiocb->ki_cancel) {
-
 struct kioctx {
	atomic_t		users;
-	int			dead;
+	atomic_t		dead;
Do we want to be paranoid and atomic_set() that to 0 when the ioctx is
allocated?
+	while (!list_empty(&ctx->active_reqs)) {
+		struct list_head *pos
- int i = 0;
+ DEFINE_WAIT(wait);
+ struct hrtimer_sleeper t;
+ size_t i = 0;
Changing i to size_t is kind of surprising. Is that on purpose?
-	set_task_state(tsk, TASK_RUNNING);
-	remove_wait_queue(&ctx->wait, &wait);
-
We can't use cmpxchg() on the ring buffer's head pointer directly, since it's modded to nr_events and would be susceptible to ABA. So instead we maintain a shadow head that uses the full 32 bits, and cmpxchg() that, and then update the real head pointer.
Time to update this comment to reflect
On Mon, Oct 01, 2012 at 03:23:41PM -0700, Kent Overstreet wrote:
So, I and other people keep running into things where we really need to
add an interface to pass some auxiliary... stuff along with a pread() or
pwrite().
Sure. Martin (cc:ed) will sympathize.
A few examples:
* IO scheduler
Not just per sector: per *hardware* sector. For passing around checksums, userspace would have to find out the hardware sector size and checksum type/size via a different interface, and then the attribute would contain a pointer to a buffer that can hold the appropriate number of checksums.
All
The generic code wouldn't know about any user pointers inside attributes, so it'd have to be downstream consumers. Hopefully there won't be many attributes with user pointers in them (I don't expect there to be), so we won't have too much of this messiness.
I really don't like this. We
The merge processing occurs during kmem_cache_create and you are setting up the decoder field afterwards! Won't work.
In the thread I suggested providing the callback at destruction:
http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg21130.html
I liked that it limits accessibility of
I don't like it :-)
For a fundamental reason or because it happens to not work yet? :)
- z
The latter - it fails the test that I posted.
OK, good. That's easy enough to fix :) I'll send out a tested version.
- z
Second, Oracle is now working on Btrfs (if ever a FS needed a better name... is that pronounced ButterFS?).
(In our silliest moments, yes. Absolutely.)
- z
+#ifndef CONFIG_STABLE
/*
* We should return 0 if size == 0 (which would result in the
* kmalloc caller to get NULL) but we use the smallest object
@@ -81,6 +82,7 @@ static inline int kmalloc_index(size_t s
* we can discover locations where we do 0 sized
cachemiss_thread should explicitly return 0 or error instead of
task_ret_reg(current) (which is -ENOSYS anyway) because
async_thread_helper is careful to put the return value in eax anyway.
Can you explain what motivated you to send out this patch?
It used to return 0. It was changed
Add a bunch of includes to sys.h and syslet.h to kill off compilation
warnings.
This, and the patches which add tests, all look great to me.
Ingo, are you patching up your tests or do you want me to take care of
these?
- z
the demos I sent out. Dunno about the existing ones, but I bet they do
the same.
Hmm, they didn't when I ran them, but I'll give yours a try and take a
closer look. Thanks for taking the time to bring it up.
- z
On Mon, Jun 04, 2007 at 12:31:45PM -0400, Jeff Dike wrote:
Syslets seem like a fundamentally good idea to me, but the current
implementation, using CLONE_THREAD threads, seems like a basic
problem.
It has remaining problems that need to be addressed, yes.
First, there are signals. If the
and then return it.
__exec_atom() sets task_ret_reg() to NULL if there's a chance that it will
block while executing the syscall in the atom.
Signed-off-by: Zach Brown [EMAIL PROTECTED]
diff -r f0d8ee165e2e kernel/async.c
--- a/kernel/async.c	Thu Jun 07 14:32:31 2007 -0700
+++ b/kernel/async.c	Thu
I'd just like to take the chance also to ask about a VM/FS meetup some time around kernel summit (maybe take a bit of time during UKUUG or so).
Yeah, I'd be interested.
More issues:
- chris mason's patches to normalize buffered and direct locking
- z
lock if the final reference was just dropped. Another CPU might free
the dio in bio completion and reuse the memory after this path drops the
dio lock but before the BUG_ON() is evaluated.
This patch passed aio+dio regression unit tests and aio-stress on ext3.
Signed-off-by: Zach Brown [EMAIL
the BUG_ON(). But unfortunately, our perf team is able to reproduce the problem.
What are they doing to reproduce it? How much setup does it take?
Debugging indicated that ret2 == 1 :(
That could be consistent with the theory that we're racing with the
dio struct being freed and reused
FWIW, I believe Andrew's point was that critical information for Joe
Enduser (and Joe Patch-Ho) was lacking in the original changelog.
and don't forget Joe eCryptfs-Maintainer-2-Years-In-The-Future.
- z
I'm pleased to announce the availability of version 6 of the syslet subsystem.
Ingo and I agreed that I'll handle syslet releases while he's busy with CFS. I
copied the cc: list from Ingo's v5 announcement. If you'd like to be dropped
(or added), please let me know.
The v6 patch series against
.. so don't keep us in suspense. Do you have any numbers for anything
(like Oracle, to pick a random thing out of thin air ;) that might
actually indicate whether this actually works or not?
I haven't gotten to running Oracle's database against it. It is going
to be Very Cranky if O_DIRECT
You should pick up the kevent work :)
I haven't looked at it in a while but yes, it's on the radar :).
Having async request and response rings would be quite useful, and most
closely match what is going on under the hood in the kernel and hardware.
Yeah, but I have lots of competing
Yeah, it'll confuse CFQ a lot actually. The threads either need to share
an io context (clean approach, however will introduce locking for things
that were previously lockless), or CFQ needs to get better support for
cooperating processes.
Do let me know if I can be of any help in this.
For
due to the added syscall. (Maybe we can just get that reserved
upstream now?)
Maybe, but we'd have to agree on the bare syslet interface that is being
supported :).
Personally, I'd like that to be the simplest thing that works for people
and I'm not convinced that the current syslet-specific
On Wed, May 30, 2007 at 02:49:03PM +0200, Peter Zijlstra wrote:
Use the lockdep infrastructure to track lock contention and other lock
statistics.
I really like the sound of this.
Has anyone given you an indication of when it might be merged?
- z
I fear the consequences of this change :(
I love it. In the past I've lost time by working with patches which didn't quite realize that ext3 holds a transaction open during ->direct_IO.
Oh well, please keep it alive, maybe beat on it a bit, resend it
later on?
I can test the patch to make
What about introducing a new flag, O_COMPR which tells the
kernel, btw, we want this file to be decompressed if it can be. It
can fallback to O_RDONLY or something like that? That gets rid of
the chattr ugliness.
How is that different from chattr ugliness, which also comes down to a
@@ -50,7 +50,7 @@ static void adfs_write_failed(struct address_space *mapping, loff_t to)
	struct inode *inode = mapping->host;

	if (to > inode->i_size)
-		truncate_pagecache(inode, to, inode->i_size);
+		truncate_pagecache(inode, inode->i_size);
 }
All these
On Mon, Aug 26, 2013 at 10:02:59PM +, Nicholas A. Bellinger wrote:
From: Nicholas Bellinger n...@daterainc.com
Hi folks,
This -v2 series adds support to target-core for generic EXTENDED_COPY offload
emulation as defined by SPC-4 using virtual (IBLOCK, FILEIO, RAMDISK)
backends.
Cool,
if the caller wants to avoid unaccelerated copying,
perhaps by setting behavioural flags.
The SPLICE_F_DIRECT flag is arguably misused here to indicate both
file-to-file direct splicing *and* acceleration.
Signed-off-by: Zach Brown z...@redhat.com
---
fs/bad_inode.c | 8
fs/splice.c
When I first started on this stuff I followed the lead of previous
work and added a new syscall for the copy operation:
https://lkml.org/lkml/2013/5/14/618
Towards the end of that thread Eric Wong asked why we didn't just
extend splice. I immediately replied with some dumb dismissive
answer.
lets the file system lock both for
the duration of the copy, should it need to. If the method refuses to
accelerate the copy, for whatever reason, we can naturally fall back to
the generic direct splice method that sendfile uses today.
Signed-off-by: Zach Brown z...@redhat.com
---
fs/splice.c
() already does elsewhere) is moved to a
new much smaller btrfs_ioctl_clone().
btrfs_splice_direct() thus inherits the conservative limitations of the
btrfs clone ioctl: it only allows block-aligned copies between files on
the same snapshot.
Signed-off-by: Zach Brown z...@redhat.com
---
fs/btrfs
That make sense? I can show you more concretely what I'm working on if
you want. Or if I'm full of crap and this is useless for what you guys
want I'm sure you'll let me know :)
It sounds interesting, but also a little confusing at this point, at
least from the non-block side of
- app calls splice(from, 0, to, 0, SIZE_MAX)
1) VFS calls ->direct_splice(from, 0, to, 0, SIZE_MAX)
1.a) fs reflinks the whole file in a jiffy and returns the size of the file
1.b) fs does copy offload of, say, 64MB and returns 64MB
2) VFS does page copy of, say, 1MB and returns
As for aio-direct... Two questions:
* had anybody tried to measure the effect on branch predictor from
introducing that method vector? Commit d6afd4c4 (iov_iter: hide iovec
details behind ops function pointers)
FWIW, I never did. I only went that route to begin with because the few
I've got an alternate approach for fixing this wart in lookup_ioctx()... Instead of using an rbtree, just use the reserved id in the ring buffer header to index an array pointing to the ioctx. It's not finished yet, and it needs to be tidied up, but is most of the way there.
Yeah, that
Time for new open source pastures outside the kernel, for me.
Thanks for all your hard work over the years. Here's to good luck in
the future!
- z
+static void sort_parents3(struct dentry **p)
+void sort_parents(struct dentry **p, unsigned *nump)
Yikes, that's a bunch of fiddly code. Is it *really* worth all that to
avoid calling the generic sort helpers?
AFAICS, I cannot make the compare function transitive, e.g.: A is
I ended up working on this a bit today, and managed to cobble together
something that somewhat works -- please see the patch below.
Just some quick observations:
+	ctx->ctx_file = anon_inode_getfile("[aio]", &aio_ctx_fops, ctx, O_RDWR);
+	if (IS_ERR(ctx->ctx_file)) {
+
On Tue, May 21, 2013 at 07:47:19PM +, Eric Wong wrote:
Zach Brown z...@redhat.com wrote:
On Wed, May 15, 2013 at 07:44:05PM +, Eric Wong wrote:
Why introduce a new syscall instead of extending sys_splice?
Personally, I think it's ugly to have different operations use the same
Some quick thoughts:
Permute the location of files. E.g. 'permute(A, B, C)' is equivalent to A->B, B->C and C->A. This is essentially a series of renames done as a single atomic operation.
Hmm. Can we choose a more specific name than 'permute'? To me, ->permute() tells me just as much
Add sys_copy_range to the x86 syscall tables. Happily, it doesn't
require compat helpers.
Signed-off-by: Zach Brown z...@redhat.com
---
arch/x86/syscalls/syscall_32.tbl | 1 +
arch/x86/syscalls/syscall_64.tbl | 1 +
2 files changed, 2 insertions(+)
diff --git a/arch/x86/syscalls/syscall_32.tbl
This crude patch illustrates the simplest plumbing involved in supporting sys_copy_range with the NFS COPY operation that's pending in the 4.2 draft spec.
The patch is based on a previous prototype that used the COPY op to
implement sys_copyfileat which created a new file (based on the ocfs2
We've been talking about implementing some form of bulk data copy
offloading for a while now. BTRFS and OCFS2 implement forms of copy
offloading with ioctls, NFS 4.2 will include a byte-granular COPY
operation, and the SCSI XCOPY command is being implemented now that
Windows can issue it.
In the
the
CLONE_RANGE ioctl and copy_range syscall.
Signed-off-by: Zach Brown z...@redhat.com
---
fs/btrfs/ctree.h | 3 ++
fs/btrfs/file.c | 1 +
fs/btrfs/ioctl.c | 122 +--
3 files changed, 77 insertions(+), 49 deletions(-)
diff --git a/fs/btrfs/ctree.h
mpage.o ioprio.o
diff --git a/fs/copy_range.c b/fs/copy_range.c
new file mode 100644
index 000..3000b9f
--- /dev/null
+++ b/fs/copy_range.c
@@ -0,0 +1,127 @@
+/*
+ * copy_range: offload data copying between existing files
+ *
+ * Copyright (C) 2013 Zach Brown z...@redhat.com
+ */
+#include <linux/fs.h>
On Wed, May 15, 2013 at 07:42:51AM +1000, Dave Chinner wrote:
On Tue, May 14, 2013 at 02:15:22PM -0700, Zach Brown wrote:
I'm going to keep hacking away at this. My next step is to get ext4
supporting .copy_range, probably with a quick hack to copy the
contents of bios. Hopefully that'll
On Wed, May 15, 2013 at 07:44:05PM +, Eric Wong wrote:
Why introduce a new syscall instead of extending sys_splice?
Personally, I think it's ugly to have different operations use the same
syscall just because their arguments match.
But that preference aside, sure, if the consensus is that
Hrmph. I had composed a reply to you during Plumbers but.. something
happened to it :). Here's another try now that I'm back.
Some things to talk about:
- I really don't care about the naming here. If you do, holler.
- We might want different flags for file-to-file splicing and
On Wed, Sep 25, 2013 at 03:02:29PM -0400, Anna Schumaker wrote:
On Wed, Sep 25, 2013 at 2:38 PM, Zach Brown z...@redhat.com wrote:
Hrmph. I had composed a reply to you during Plumbers but.. something
happened to it :). Here's another try now that I'm back.
Some things to talk about
A client-side copy will be slower, but I guess it does have the
advantage that the application can track progress to some degree, and
abort it fairly quickly without leaving the file in a totally undefined
state--and both might be useful if the copy's not a simple constant-time
operation.
I
void zero_fill_bio(struct bio *bio)
{
- unsigned long flags;
struct bio_vec bv;
struct bvec_iter iter;
- bio_for_each_segment(bv, bio, iter) {
+#if defined(CONFIG_HIGHMEM) || defined(ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE)
+ bio_for_each_page(bv, bio, iter) {
+
On Wed, Sep 25, 2013 at 02:49:10PM -0700, Kent Overstreet wrote:
On Wed, Sep 25, 2013 at 02:17:02PM -0700, Zach Brown wrote:
void zero_fill_bio(struct bio *bio)
{
- unsigned long flags;
struct bio_vec bv;
struct bvec_iter iter;
- bio_for_each_segment(bv, bio, iter
Sigh. A pox on whoever thought up huge pages.
managing 1TB+ of memory in 4K chunks is just insane.
The question of larger pages is not if, but only when.
And how!
Sprinkling a bunch of magical if (thp) {} else {} throughout the code looks like a stunningly bad idea to me. It'd take real
On Thu, Sep 26, 2013 at 10:58:05AM +0200, Miklos Szeredi wrote:
On Wed, Sep 25, 2013 at 11:07 PM, Zach Brown z...@redhat.com wrote:
A client-side copy will be slower, but I guess it does have the
advantage that the application can track progress to some degree, and
abort it fairly quickly
On Thu, Sep 26, 2013 at 08:06:41PM +0200, Miklos Szeredi wrote:
On Thu, Sep 26, 2013 at 5:34 PM, J. Bruce Fields bfie...@fieldses.org wrote:
On Thu, Sep 26, 2013 at 10:58:05AM +0200, Miklos Szeredi wrote:
On Wed, Sep 25, 2013 at 11:07 PM, Zach Brown z...@redhat.com wrote:
A client-side
Sure. So we'd have:
- a no-flag default that forbids knowingly copying with shared references, so that it will be used by default by people who feel strongly about their assumptions about independent write durability.
- a flag that allows shared references for people who would