from:"Eric Van Hensbergen"

Re: 9pfs hangs since 4.7

2016-11-29 Thread Eric Van Hensbergen

Any idea what xfstests is doing at this point in time?  I'd be a
bit worried about some sort of loop in the namespace since it seems to
be in path traversal.  It could also be some sort of resource leak or
fragmentation; I'll admit that many of the regression tests we do are
fairly short in duration.  Another approach would be to try
this with a different server (over a network link instead of virtio)
to isolate it as a client-side versus server-side problem (although from
the looks of things this does seem to be a client issue).

On Thu, Nov 24, 2016 at 1:50 PM, Tuomas Tynkkynen  wrote:
> Hi fsdevel,
>
> I have been observing hangs when running xfstests generic/224. Curiously
> enough, the test is *not* causing problems on the FS under test (I've
> tried both ext4 and f2fs) but instead it's causing the 9pfs that I'm
> using as the root filesystem to crap out.
>
> How it shows up is that the test doesn't finish in time (usually
> takes ~50 sec) but the hung task detector triggers for some task in
> d_alloc_parallel():
>
> [  660.701646] INFO: task 224:7800 blocked for more than 300 seconds.
> [  660.702756]   Not tainted 4.9.0-rc5 #1-NixOS
> [  660.703232] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  660.703927] 224 D0  7800549 0x
> [  660.704501]  8a82ec022800  8a82fc03c800 8a82ff217dc0
> [  660.705302]  8a82d0f88c00 a94a41a27b88 aeb4ad1d a94a41a27b78
> [  660.706125]  ae800fc6 8a82fbd90f08 8a82d0f88c00 8a82fbfd5418
> [  660.706924] Call Trace:
> [  660.707185]  [] ? __schedule+0x18d/0x640
> [  660.707751]  [] ? __d_alloc+0x126/0x1e0
> [  660.708304]  [] schedule+0x36/0x80
> [  660.708841]  [] d_alloc_parallel+0x3a7/0x480
> [  660.709454]  [] ? wake_up_q+0x70/0x70
> [  660.710007]  [] lookup_slow+0x73/0x140
> [  660.710572]  [] walk_component+0x1ca/0x2f0
> [  660.711167]  [] ? path_init+0x1d9/0x330
> [  660.711747]  [] ? mntput+0x24/0x40
> [  660.716962]  [] path_lookupat+0x5d/0x110
> [  660.717581]  [] filename_lookup+0x9e/0x150
> [  660.718194]  [] ? kmem_cache_alloc+0x156/0x1b0
> [  660.719037]  [] ? getname_flags+0x56/0x1f0
> [  660.719801]  [] ? getname_flags+0x72/0x1f0
> [  660.720492]  [] user_path_at_empty+0x36/0x40
> [  660.721206]  [] vfs_fstatat+0x53/0xa0
> [  660.721980]  [] SYSC_newstat+0x1f/0x40
> [  660.722732]  [] SyS_newstat+0xe/0x10
> [  660.723702]  [] entry_SYSCALL_64_fastpath+0x1a/0xa9
>
> SysRq-T is full of things stuck inside p9_client_rpc like:
>
> [  271.703598] bashS0   100 96 0x
> [  271.703968]  8a82ff824800  8a82faee4800 8a82ff217dc0
> [  271.704486]  8a82fb946c00 a94a404ebae8 aeb4ad1d 8a82fb9fc058
> [  271.705024]  a94a404ebb10 ae8f21f9 8a82fb946c00 8a82fbbba000
> [  271.705542] Call Trace:
> [  271.705715]  [] ? __schedule+0x18d/0x640
> [  271.706079]  [] ? idr_get_empty_slot+0x199/0x3b0
> [  271.706489]  [] schedule+0x36/0x80
> [  271.706825]  [] p9_client_rpc+0x12a/0x460 [9pnet]
> [  271.707239]  [] ? idr_alloc+0x87/0x100
> [  271.707596]  [] ? wake_atomic_t_function+0x60/0x60
> [  271.708043]  [] p9_client_walk+0x77/0x200 [9pnet]
> [  271.708459]  [] v9fs_vfs_lookup.part.16+0x59/0x120 [9p]
> [  271.708912]  [] v9fs_vfs_lookup+0x1f/0x30 [9p]
> [  271.709308]  [] lookup_slow+0x96/0x140
> [  271.709664]  [] walk_component+0x1ca/0x2f0
> [  271.710036]  [] ? path_init+0x1d9/0x330
> [  271.710390]  [] path_lookupat+0x5d/0x110
> [  271.710763]  [] filename_lookup+0x9e/0x150
> [  271.711136]  [] ? mem_cgroup_commit_charge+0x7e/0x4a0
> [  271.711581]  [] ? kmem_cache_alloc+0x156/0x1b0
> [  271.711977]  [] ? getname_flags+0x56/0x1f0
> [  271.712349]  [] ? getname_flags+0x72/0x1f0
> [  271.712726]  [] user_path_at_empty+0x36/0x40
> [  271.713110]  [] vfs_fstatat+0x53/0xa0
> [  271.713454]  [] SYSC_newstat+0x1f/0x40
> [  271.713810]  [] SyS_newstat+0xe/0x10
> [  271.714150]  [] entry_SYSCALL_64_fastpath+0x1a/0xa9
>
> [  271.729022] sleep   S0   218216 0x0002
> [  271.729391]  8a82fb990800  8a82fc0d8000 8a82ff317dc0
> [  271.729915]  8a82fbbec800 a94a404f3cf8 aeb4ad1d 8a82fb9fc058
> [  271.730426]  ec95c1ee08c0 0001 8a82fbbec800 8a82fbbba000
> [  271.730950] Call Trace:
> [  271.731115]  [] ? __schedule+0x18d/0x640
> [  271.731479]  [] schedule+0x36/0x80
> [  271.731814]  [] p9_client_rpc+0x12a/0x460 [9pnet]
> [  271.732226]  [] ? wake_atomic_t_function+0x60/0x60
> [  271.732649]  [] p9_client_clunk+0x38/0xb0 [9pnet]
> [  271.733061]  [] v9fs_dir_release+0x1a/0x30 [9p]
> [  271.733494]  [] __fput+0xdf/0x1f0
> [  271.733844]  [] fput+0xe/0x10
> [  271.734176]  [] task_work_run+0x7e/0xa0
> [  271.734532]  [] do_exit+0x2b9/0xad0
> [  271.734888]  [] ? __do_page_fault+0x287/0x4b0
> [  271.735276]  [] do_group_exit+0x43/0xb0
> [  271.7

Re: [V9fs-developer] [PATCH] 9p: trans_fd, initialize recv fcall properly if not set

2015-09-07 Thread Eric Van Hensbergen

I thought the nature of trans_fd would have prevented any sort of true
zero copy, but I suppose one less is always welcome :)

-eric


On Sun, Sep 6, 2015 at 1:55 AM, Dominique Martinet
 wrote:
> Eric Van Hensbergen wrote on Sat, Sep 05, 2015:
>> On Thu, Sep 3, 2015 at 4:38 AM, Dominique Martinet
>>  wrote:
>> > To be honest, I think it might be better to just bail out if we get in
>> > this switch (m->req->rc == NULL after p9_tag_lookup) and not try to
>> > allocate more, because if we get there it's likely a race condition and
>> > silently re-allocating will end up in more troubles than trying to
>> > recover is worth.
>> > Thoughts ?
>> >
>>
>> Hmmm...trying to rattle my brain and remember why I put it in there
>> back in 2008.
>> It might have just been over-defensive programming -- or more likely it
>> just pre-dated all the zero copy infrastructure which pretty much
>> guaranteed we had an rc allocated and what is there is vestigial.  I'm
>> happy to accept a patch which makes this an assert, or perhaps just
>> resets the connection because something has gone horribly wrong (similar
>> to the ENOMEM path that is there now).
>
> Yeah, it looks like the safety comes from the zero-copy stuff that came
> much later.
> Let's go with resetting the connection then. Hmm. EIO is a bit too
> generic so would be good to avoid that if possible, but can't think of
> anything better...
>
>
> Speaking of zero-copy, I believe it should be fairly straight-forward to
> implement for trans_fd now I've actually looked at it, since we do the
> payload read after a p9_tag_lookup, would just need m->req to point to a
> zc buffer. Write is similar, if there's a zc buffer just send it after
> the header.
> The cost is a couple more pointers in req and an extra if in both
> workers, that seems pretty reasonable.
>
> Well, I'm not using trans_fd much here (and unfortunately zero-copy
> isn't possible at all given the transport protocol for RDMA, at least
> for recv), but if anyone cares it probably could be done without too
> much hassle for the fd workers.
>
> --
> Dominique
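
A userspace sketch of the write side described in the quoted paragraph:
if a request carries a separate payload buffer, hand the header and the
payload to the kernel in one gather write instead of copying the payload
behind the header first.  The struct zc_request type and send_request()
are hypothetical names for illustration, and writev(2) only stands in for
the kernel socket I/O that the trans_fd workers actually use.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical request: a marshalled header plus an optional payload
 * buffer that is owned by the caller and never copied. */
struct zc_request {
	const void *hdr;
	size_t hdr_len;
	const void *zc_buf;	/* NULL when there is no separate payload */
	size_t zc_len;
};

/* Send the header and, if present, the payload right after it.
 * Returns the number of bytes written or -1 with errno set.
 * Short writes are not retried here to keep the sketch small. */
ssize_t send_request(int fd, const struct zc_request *req)
{
	struct iovec iov[2];
	int cnt = 0;

	assert(req != NULL && req->hdr != NULL);

	iov[cnt].iov_base = (void *)req->hdr;
	iov[cnt].iov_len = req->hdr_len;
	cnt++;

	if (req->zc_buf != NULL) {	/* the "extra if" in the worker */
		iov[cnt].iov_base = (void *)req->zc_buf;
		iov[cnt].iov_len = req->zc_len;
		cnt++;
	}

	return writev(fd, iov, cnt);
}
```

The read side would be the mirror image: once the tag lookup has found
the request, the payload is read straight into the buffer the request
points at.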
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] 9p: trans_fd, initialize recv fcall properly if not set

2015-09-05 Thread Eric Van Hensbergen

On Thu, Sep 3, 2015 at 4:38 AM, Dominique Martinet
 wrote:
> That code really should never be called (rc is allocated in
> tag_alloc), but if it had been it couldn't have worked...
>
> Signed-off-by: Dominique Martinet 
> ---
>  net/9p/trans_fd.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> To be honest, I think it might be better to just bail out if we get in
> this switch (m->req->rc == NULL after p9_tag_lookup) and not try to
> allocate more, because if we get there it's likely a race condition and
> silently re-allocating will end up in more troubles than trying to
> recover is worth.
> Thoughts ?
>

Hmmm...trying to rattle my brain and remember why I put it in there
back in 2008.
It might have just been over-defensive programming -- or more likely it
just pre-dated all the zero copy infrastructure which pretty much
guaranteed we had an rc allocated and what is there is vestigial.  I'm
happy to accept a patch which makes this an assert, or perhaps just
resets the connection because something has gone horribly wrong (similar
to the ENOMEM path that is there now).
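
A minimal sketch of the "reset the connection" option, written as plain
C rather than kernel code.  struct conn, struct request, and
handle_reply_header() are hypothetical stand-ins for the trans_fd
connection, the tagged request, and the read worker step; only the idea
(treat a looked-up request with no rc as fatal instead of allocating a
new one) comes from this thread, and -EIO is a placeholder error code.

```c
#include <errno.h>
#include <stddef.h>

struct fcall {
	size_t capacity;
};

struct request {
	struct fcall *rc;	/* receive buffer, normally set at tag allocation */
};

struct conn {
	int err;		/* sticky error: non-zero once the connection is dead */
};

/* Called after the tag lookup has produced req for an incoming reply.
 * Returns 0 if the reply can be read into req->rc, or a negative errno
 * after marking the connection as failed. */
int handle_reply_header(struct conn *c, struct request *req)
{
	if (c->err)
		return c->err;

	if (req == NULL || req->rc == NULL) {
		/* No receive buffer: likely a race, so do not re-allocate. */
		c->err = -EIO;	/* placeholder choice of error code */
		return c->err;
	}
	return 0;
}
```

Because the error is sticky, every later request on the same connection
fails fast instead of running into the same inconsistent state.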

-eric

[GIT PULL] 9p changes for 4.3 merge window (part-1)

2015-09-05 Thread Eric Van Hensbergen

The following changes since commit eb63b34bdfbdd70a734c2a90d89117c5c6c605c2:

  Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus (2015-08-23 07:23:09 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git tags/for-linus-4.3-merge-window-part-1

for you to fetch changes up to b5ac1fb2717e48177d3f73f9e4c9b556c0a24c6b:

  9p: fix return code of read() when count is 0 (2015-08-23 14:21:36 -0500)


Just a few cleanups for 4.3 merge window for the 9p file system.
I've gotten several more over the past week, but this group has been
in for-next for at least a couple of weeks so I figured I'd push them
first while I test the rest.  Most of the ones not in this set are
bug-fixes anyways so I could hold them for rc1 if you'd rather they
see more time in for-next.

-eric


Fabian Frederick (1):
  9p: remove unused option Opt_trans

Vincent Bernat (1):
  9p: fix return code of read() when count is 0

 fs/9p/v9fs.c | 2 +-
 fs/9p/vfs_file.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

[GIT PULL] 9p: patches for the 4.1 merge window

2015-04-18 Thread Eric Van Hensbergen

The following changes since commit b314acaccd7e0d55314d96be4a33b5f50d0b3344:

  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input (2015-03-19 16:43:10 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git tags/for-linus-4.1-merge-window

for you to fetch changes up to f569d3ef8254d4b3b8daa4f131f9397d48bf296c:

  net/9p: add a privport option for RDMA transport. (2015-03-21 19:32:33 -0700)


9p: patches for 4.1 merge window

Some accumulated cleanup patches for kerneldoc and unused variables,
as well as some lock bug fixes and a new privport option for RDMA.

A quick check shows some merge-conflicts versus current-tip on
   9p: use unsigned integers for nwqid/count
If you would prefer, I can rebase, remerge, and fix the patch, but I didn't
want to do that and lose the for-next references.

Signed-off-by: Eric Van Hensbergen 


Andrey Ryabinin (1):
  net/9p: use memcpy() instead of snprintf() in p9_mount_tag_show()

Dominique Martinet (3):
  net/9p: Initialize opts->privport as it should be.
  fs/9p: Initialize status in v9fs_file_do_lock.
  net/9p: add a privport option for RDMA transport.

Fabian Frederick (2):
  9p: kerneldoc warning fixes
  9p: remove unused variable in p9_fd_create()

Kirill A. Shutemov (3):
  9p: fix error handling in v9fs_file_do_lock
  9p: do not crash on unknown lock status code
  9p: use unsigned integers for nwqid/count

 fs/9p/v9fs.h  |  1 -
 fs/9p/vfs_addr.c  |  2 --
 fs/9p/vfs_file.c  | 10 ++
 net/9p/protocol.c |  6 +++---
 net/9p/trans_fd.c |  3 +--
 net/9p/trans_rdma.c   | 52 +
 net/9p/trans_virtio.c |  5 -
 7 files changed, 58 insertions(+), 21 deletions(-)

Re: [PATCH] Make v9fs uname and remotename parsing more robust

2008-02-24 Thread Eric Van Hensbergen

On Sat, Feb 23, 2008 at 2:07 AM, Andrew Morton
<[EMAIL PROTECTED]> wrote:
>
>  It would be better to present this as two patches.  One adds the new core
>  APIs and the other uses those APIs in v9fs.  The patches would take
>  separate routes into mainline.
>
>  I guess I can sneak this one in as-is, as long as the v9fs guys are OK with
>  that?
>

I'm fine with it.  Shall I pull it through the v9fs-devel patch line
or would you rather send it with your patches Andrew?

-eric

Re: [PULL] v9fs patches for merge window

2008-02-07 Thread Eric Van Hensbergen

On Feb 6, 2008 8:43 PM, Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Wed, 6 Feb 2008 19:39:26 -0600 "Eric Van Hensbergen" <[EMAIL PROTECTED]> wrote:
>
> Could you please cc me on pull requests?  I need to pay more attention to
> them.  Thanks.
>
> > Andrew Morton (1):
> >   9p: fix p9_printfcall export
>
> Really this should have been folded into the patch which it fixes.  We get
> a cleaner history that way, and it protects git-bisectability.
>

I would have, but I didn't see the original offender in my upstream
branch, so I just applied it separately - looks to me like fcprint.c
hasn't been touched (until your patch) since its introduction:

[EMAIL PROTECTED]:~/src/linux/9p$ git log net/9p/fcprint.c
commit bd238fb431f31989898423c8b6496bc8c4204a86
Author: Latchesar Ionkov <[EMAIL PROTECTED]>
Date:   Tue Jul 10 17:57:28 2007 -0500

9p: Reorganization of 9p file system code

This patchset moves non-filesystem interfaces of v9fs from fs/9p to net/9p.
It moves the transport, packet marshalling and connection layers to net/9p
leaving only the VFS related files in fs/9p.  This work is being done in
preparation for in-kernel 9p servers as well as alternate 9p clients (other
than VFS).

    Signed-off-by: Latchesar Ionkov <[EMAIL PROTECTED]>
Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>

-eric

[PULL] v9fs patches for merge window

2008-02-06 Thread Eric Van Hensbergen

The following changes since commit 3e6bdf473f489664dac4d7511d26c7ac3dfdc748:
  Linus Torvalds (1):
Merge git://git.kernel.org/.../x86/linux-2.6-x86

are found in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git

Andrew Morton (1):
  9p: fix p9_printfcall export

Anthony Liguori (2):
  9p: add support for sticky bit
  9p: Convert semaphore to spinlock for p9_idpool

Eric Van Hensbergen (7):
  9p: create transport rpc cut-thru
  9p: block-based virtio client
  9p: fix bug in attach-per-user
  9p: Fix soft lockup in virtio transport
  9p: fix mmap to be read-only
  9p: add remove function to trans_virtio
  9p: transport API reorganization

Martin Stava (1):
  9p: fix bug in p9_clone_stat

 fs/9p/fid.c|4 +-
 fs/9p/v9fs.c   |   51 +--
 fs/9p/v9fs.h   |5 +-
 fs/9p/vfs_file.c   |4 +-
 fs/9p/vfs_inode.c  |5 +
 include/net/9p/9p.h|1 +
 include/net/9p/client.h|5 +-
 include/net/9p/conn.h  |   57 ---
 include/net/9p/transport.h |   11 +-
 net/9p/Makefile|1 -
 net/9p/client.c|  161 +--
 net/9p/fcprint.c   |4 +-
 net/9p/mod.c   |9 +-
 net/9p/mux.c   | 1060 --
 net/9p/trans_fd.c  | 1103 +++-
 net/9p/trans_virtio.c  |  355 +--
 net/9p/util.c  |   20 +-
 17 files changed, 1466 insertions(+), 1390 deletions(-)
 delete mode 100644 include/net/9p/conn.h
 delete mode 100644 net/9p/mux.c

[GIT PULL] 9p: 2.6.24-rc1 patches

2007-11-06 Thread Eric Van Hensbergen

Linus,
   Please pull the following bug-fixes for v9fs.

The following changes since commit 2655e2cee2d77459fcb7e10228259e4ee0328697:
  Alan Cox (1):
ata_piix: Add additional PCI identifier for 40 wire short cable

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git for-linus

Latchesar Ionkov (4):
  9p: fix memory leak in v9fs_get_sb
  9p: use copy of the options value instead of original
  9p: return NULL when trans not found
  9p: add missing end-of-options record for trans_fd

 fs/9p/v9fs.c  |6 --
 fs/9p/vfs_super.c |3 +++
 net/9p/mod.c  |4 ++--
 net/9p/trans_fd.c |3 ++-
 4 files changed, 11 insertions(+), 5 deletions(-)

   Thanks,
 -eric

Re: [V9fs-developer] [PATCH] 9p: v9fs_vfs_rename incorrect clunk order

2007-10-23 Thread Eric Van Hensbergen

On 10/22/07, Latchesar Ionkov <[EMAIL PROTECTED]> wrote:
> In v9fs_vfs_rename function labels don't match the fids that are clunked.
> The correct clunk order is clunking newdirfid first and then olddirfid next.
>
> Signed-off-by: Latchesar Ionkov <[EMAIL PROTECTED]>
Acked-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
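
The bug class described here lives in the error path: each cleanup label
has to release the fid its name says it does, in reverse order of
acquisition.  A small standalone sketch of that pattern follows.  The
clunk() helper only records the release order, and rename_sketch() with
its fail_new/fail_rename switches is hypothetical; none of this is the
actual v9fs_vfs_rename code.

```c
#include <stddef.h>

#define MAX_LOG 8

/* Records the order in which fids are released, for inspection. */
struct clunk_log {
	int fids[MAX_LOG];
	size_t n;
};

static void clunk(struct clunk_log *log, int fid)
{
	if (log->n < MAX_LOG)
		log->fids[log->n++] = fid;
}

/* Acquire olddirfid, then newdirfid; release newdirfid first and
 * olddirfid second.  fail_new simulates failing to get newdirfid,
 * fail_rename simulates the rename request itself failing. */
int rename_sketch(struct clunk_log *log, int fail_new, int fail_rename)
{
	const int olddirfid = 1, newdirfid = 2;
	int retval = 0;

	/* olddirfid is held from here on */
	if (fail_new) {
		retval = -1;
		goto clunk_olddir;	/* newdirfid was never obtained */
	}

	/* newdirfid is held from here on */
	if (fail_rename) {
		retval = -2;
		goto clunk_newdir;
	}

clunk_newdir:
	clunk(log, newdirfid);		/* label name matches the fid released */
clunk_olddir:
	clunk(log, olddirfid);
	return retval;
}
```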

Re: [2.6 patch] fs/9p/v9fs.c: memleak fix

2007-10-23 Thread Eric Van Hensbergen

On 10/19/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:
> This patch fixes a memory leak introduced by
> commit ba17674fe02909fef049fd4b620a2805bdb8c693.
>
> Spotted by the Coverity checker.
>
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
Acked-by: Eric Van Hensbergen <[EMAIL PROTECTED]>

[GIT PULL] 9p: 2.6.24 patches (phase 2)

2007-10-23 Thread Eric Van Hensbergen

Linus, please pull from the 'for-linus' branch of:
 git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git/ for-linus

This tree contains the following:

Latchesar Ionkov(1):
 9p: v9fs_vfs_rename incorrect clunk order

Adrian Bunk(1):
 9p: fix memleak in fs/9p/v9fs.c

Eric Van Hensbergen(1):
 9p: add virtio transport

 Documentation/filesystems/9p.txt |8
 fs/9p/v9fs.c |1
 fs/9p/vfs_inode.c|4
 include/linux/virtio_9p.h|   10 +
 net/9p/Kconfig   |7
 net/9p/Makefile  |4
 net/9p/trans_virtio.c|  353 +++
 7 files changed, 382 insertions(+), 5 deletions(-)

 Thanks,
-eric

Re: [GIT PULL] 9p patches for 2.6.24 merge window

2007-10-17 Thread Eric Van Hensbergen

On 10/17/07, Sam Ravnborg <[EMAIL PROTECTED]> wrote:
> On Wed, Oct 17, 2007 at 04:34:02PM -0500, Eric Van Hensbergen wrote:
> > Linus, please pull from the 'for-linus' branch of:
> >   git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git/ for-linus
> >
> > This tree contains the following:
> >
> > Latchesar Ionkov(3):
> >   attach-per-user support
> >   rename uid and gid parameters
> >   define session flags
> >
> > Eric Van Hensbergen(4):
> >   remove sysctl code
> >   fix bad kconfig cross-dependency
> >   soften invalidation in loose mode
> >   make transports dynamic
>
> Could you please tag your patches with 9p: or [9p] so it
> is obvious that they belong to this subsystem.
> When browsing head-commits and other places this is a great help.
>

They should be so tagged, I just stripped it in my pull email summary.

   -eric

[GIT PULL] 9p patches for 2.6.24 merge window

2007-10-17 Thread Eric Van Hensbergen

Linus, please pull from the 'for-linus' branch of:
  git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git/ for-linus

This tree contains the following:

Latchesar Ionkov(3):
  attach-per-user support
  rename uid and gid parameters
  define session flags

Eric Van Hensbergen(4):
  remove sysctl code
  fix bad kconfig cross-dependency
  soften invalidation in loose mode
  make transports dynamic

There are a few patches relating to a virtio transport support that
I'm holding back until I know Rusty's lguest series is merged.

 b/Documentation/filesystems/9p.txt |   22 +
 b/fs/9p/fid.c  |  157 +++--
 b/fs/9p/v9fs.c |  189 +++-
 b/fs/9p/v9fs.h |   38 +--
 b/fs/9p/vfs_file.c |6
 b/fs/9p/vfs_inode.c|   50 ++--
 b/fs/9p/vfs_super.c|   19 -
 b/include/net/9p/9p.h  |   21 -
 b/include/net/9p/client.h  |9
 b/include/net/9p/conn.h|4
 b/include/net/9p/transport.h   |   27 +-
 b/net/9p/Kconfig   |   10
 b/net/9p/Makefile  |5
 b/net/9p/client.c  |   13 -
 b/net/9p/conv.c|   32 ++
 b/net/9p/mod.c |   71 +-
 b/net/9p/mux.c |5
 b/net/9p/trans_fd.c|  419 +++--
 net/9p/sysctl.c|   81 ---
 19 files changed, 689 insertions(+), 489 deletions(-)


Thanks,
  -eric

Re: [PATCH 3/5][9PFS] Cleanup explicit check for mandatory locks

2007-09-17 Thread Eric Van Hensbergen

On 9/17/07, Pavel Emelyanov <[EMAIL PROTECTED]> wrote:
> The __mandatory_lock(inode) macro makes the same check, but
> makes the code more readable.
>
> Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]>
> Acked-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
>
> ---
>
>  fs/9p/vfs_file.c |2 +-
>  1 files changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
> index 2a40c29..7166916 100644
> --- a/fs/9p/vfs_file.c
> +++ b/fs/9p/vfs_file.c
> @@ -105,7 +105,7 @@ static int v9fs_file_lock(struct file *f
> P9_DPRINTK(P9_DEBUG_VFS, "filp: %p lock: %p\n", filp, fl);
>
> /* No mandatory locks */
> -   if ((inode->i_mode & (S_ISGID | S_IXGRP)) == S_ISGID)
> +   if (__mandatory_lock(inode))
> return -ENOLCK;
>
> if ((IS_SETLK(cmd) || IS_SETLKW(cmd)) && fl->fl_type != F_UNLCK) {
>
>
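
For reference, the open-coded test that the diff replaces can be tried
out in userspace.  The sketch below applies the same bit test to a plain
mode_t; mode_is_mandatory_lock() is a hypothetical name, and the kernel
helper takes an inode rather than a mode.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Userspace analogue of the open-coded test in the diff: setgid set
 * while group-execute is clear marks a file for mandatory locking. */
bool mode_is_mandatory_lock(mode_t mode)
{
	return (mode & (S_ISGID | S_IXGRP)) == S_ISGID;
}
```

A setgid file that is also group-executable is an ordinary setgid
binary, which is why both bits have to be examined together.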

[RFC][PATCH] 9p: add readahead support for loose mode

2007-09-14 Thread Eric Van Hensbergen

This patch adds readpages support in support of readahead when using loose
cache mode.  It substantially increases performance for certain workloads.

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 fs/9p/v9fs.c|2 +-
 fs/9p/vfs_addr.c|   98 ++
 include/net/9p/client.h |3 +-
 net/9p/client.c |   82 +--
 4 files changed, 143 insertions(+), 42 deletions(-)

diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c
index 89ee0ba..ca97404 100644
--- a/fs/9p/v9fs.c
+++ b/fs/9p/v9fs.c
@@ -131,7 +131,7 @@ static void v9fs_parse_options(struct v9fs_session_info *v9ses)
char *s, *e;
 
/* setup defaults */
-   v9ses->maxdata = 8192;
+   v9ses->maxdata = (64*1024);
v9ses->afid = ~0;
v9ses->debug = 0;
v9ses->cache = 0;
diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c
index 6248f0e..86c6e0d 100644
--- a/fs/9p/vfs_addr.c
+++ b/fs/9p/vfs_addr.c
@@ -31,8 +31,11 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 
@@ -50,31 +53,108 @@
 
 static int v9fs_vfs_readpage(struct file *filp, struct page *page)
 {
-   int retval;
loff_t offset;
char *buffer;
struct p9_fid *fid;
+   int retval = 0;
+   int total = 0;
+   int count = PAGE_SIZE;
 
P9_DPRINTK(P9_DEBUG_VFS, "\n");
fid = filp->private_data;
buffer = kmap(page);
offset = page_offset(page);
 
-   retval = p9_client_readn(fid, buffer, offset, PAGE_CACHE_SIZE);
-   if (retval < 0)
-   goto done;
+   while (count) {
+   struct kvec kv = {buffer+offset, PAGE_SIZE-count};
+   retval = p9_client_readv(fid, &kv, offset, 1);
+   if (retval <= 0)
+   break;
 
-   memset(buffer + retval, 0, PAGE_CACHE_SIZE - retval);
-   flush_dcache_page(page);
-   SetPageUptodate(page);
-   retval = 0;
+   buffer += retval;
+   offset += retval;
+   count -= retval;
+   total += retval;
+   }
+
+   if (retval >= 0) {
+   flush_dcache_page(page);
+   SetPageUptodate(page);
+   retval = 0;
+   }
 
-done:
kunmap(page);
unlock_page(page);
return retval;
 }
 
+/* large chunks copied and adapted from fs/cifs/file.c */
+static int v9fs_vfs_readpages(struct file *file, struct address_space *mapping,
+   struct list_head *page_list, unsigned num_pages)
+{
+   struct page *tmp_page;
+   loff_t offset;
+   struct pagevec lru_pvec;
+   struct p9_fid *fid;
+   u32 read_size;
+   int retval = 0;
+   unsigned int count = 0;
+   struct list_head *p, *n;
+
+   struct kvec *kv = kmalloc(sizeof(struct kvec)*num_pages, GFP_KERNEL);
+
+   P9_DPRINTK(P9_DEBUG_VFS, "%d pages\n", num_pages);
+
+   if (!kv)
+   return -ENOMEM;
+
+   if (list_empty(page_list))
+   goto free_vec;
+
+   pagevec_init(&lru_pvec, 0);
+
+   fid = file->private_data;
+   tmp_page = list_entry(page_list->prev, struct page, lru);
+   offset = (loff_t)tmp_page->index << PAGE_CACHE_SHIFT;
+
+   list_for_each_entry_reverse(tmp_page, page_list, lru) {
+   BUG_ON(count > num_pages);
+   if (add_to_page_cache(tmp_page, mapping,
+   tmp_page->index, GFP_KERNEL)) {
+   page_cache_release(tmp_page);
+   continue;
+   }
+
+   kv[count].iov_base = kmap(tmp_page);
+   kv[count].iov_len = PAGE_CACHE_SIZE;
+   count++;
+   }
+
+   read_size = count * PAGE_CACHE_SIZE;
+   if (!read_size)
+   goto cleanup;
+
+   retval = p9_client_readv(fid, kv, offset, count);
+
+cleanup:
+   list_for_each_safe(p, n, page_list) {
+   tmp_page = list_entry(p, struct page, lru);
+   list_del(&tmp_page->lru);
+   if (!pagevec_add(&lru_pvec, tmp_page))
+   __pagevec_lru_add(&lru_pvec);
+   kunmap(tmp_page);
+   flush_dcache_page(tmp_page);
+   SetPageUptodate(tmp_page);
+   unlock_page(tmp_page);
+   }
+   pagevec_lru_add(&lru_pvec);
+
+free_vec:
+   kfree(kv);
+   return retval;
+}
+
 const struct address_space_operations v9fs_addr_operations = {
   .readpage = v9fs_vfs_readpage,
+  .readpages = v9fs_vfs_readpages,
 };
diff --git a/include/net/9p/client.h b/include/net/9p/client.h
index 9b9221a..6f17d0a 100644
--- a/include/net/9p/client.h
+++ b/include/net/9p/client.h
@@ -67,8 +67,7 @@ int p9_client_fcreate(struct p9_fid *fid, char *name, u32 perm, int mode,
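
A userspace sketch of the fill loop that the new readpage hunk is built
around: keep reading until the page is full or the read comes back
short.  fill_page() and SKETCH_PAGE_SIZE are hypothetical, and pread(2)
stands in for p9_client_readv(), whose signature is only known here from
the call sites in the diff.  Unlike the hunk as posted, this version
requests the remaining byte count on each pass and zeroes the unread
tail; both are assumptions about the intent, not a description of what
the patch does.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_PAGE_SIZE 4096

/* Fill one page-sized buffer from fd starting at file offset off.
 * Each pass asks only for the bytes still missing; end of file leaves
 * a tail that is zeroed, as the removed memset() in the old readpage
 * did.  Returns the number of bytes read, or -1 on error. */
ssize_t fill_page(int fd, char *page, off_t off)
{
	size_t total = 0;

	while (total < SKETCH_PAGE_SIZE) {
		ssize_t n = pread(fd, page + total,
				  SKETCH_PAGE_SIZE - total, off + (off_t)total);
		if (n < 0)
			return -1;
		if (n == 0)		/* end of file */
			break;
		total += (size_t)n;
	}

	memset(page + total, 0, SKETCH_PAGE_SIZE - total);
	return (ssize_t)total;
}
```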

Re: [PATCH] 9p: attach-per-user

2007-09-13 Thread Eric Van Hensbergen

On 9/12/07, Latchesar Ionkov <[EMAIL PROTECTED]> wrote:
>
> - allow only one user to access the tree (access=<uid>)
>  Only the user with uid can access the v9fs tree. Other users that attempt
>  to access it will get EPERM error.
>

While access=<uid> and dfltuid=<uid> create an interesting
flexibility in the way things can be used, I'm wondering if
access=<uid> dfltuid=DEFAULT_UID is intuitive.  It might be nice for
the default behavior to be setting dfltuid to the uid specified in
access when that access option is used.  This can be overridden with
the dfltuid option, but I think it makes more sense to attach as the
uid you are restricting access to.
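
The precedence I have in mind could be sketched like this.  struct
mount_opts, resolve_dfltuid(), and SKETCH_DEFAULT_UID are hypothetical
names made up for the example; the real option parser stores its state
differently.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_DEFAULT_UID 0u	/* placeholder for DEFAULT_UID */

struct mount_opts {
	bool have_access_uid;	/* a numeric uid was given via access= */
	uint32_t access_uid;
	bool have_dfltuid;	/* dfltuid= was given explicitly */
	uint32_t dfltuid;
};

/* Resolve the uid used for the attach: an explicit dfltuid wins,
 * then the uid named in access=, then the built-in default. */
uint32_t resolve_dfltuid(const struct mount_opts *o)
{
	if (o->have_dfltuid)
		return o->dfltuid;
	if (o->have_access_uid)
		return o->access_uid;
	return SKETCH_DEFAULT_UID;
}
```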

If that's the way we want to go, I think that can be handled in a
separate patch.

I've merged this stuff into my test tree, as soon as regressions pass
and I confirm they compile clean on another architecture I'll push
them into my devel to be picked up by -mm.

 -eric

Re: [PATCH] 9p: rename uid and gid parameters

2007-09-13 Thread Eric Van Hensbergen

On 9/12/07, Latchesar Ionkov <[EMAIL PROTECTED]> wrote:
> Change the names of 'uid' and 'gid' parameters to the more appropriate
> 'dfltuid' and 'dfltgid'.
>

...

> strcpy(v9ses->name, V9FS_DEFUSER);
> strcpy(v9ses->remotename, V9FS_DEFANAME);
> +   v9ses->dfltuid = V9FS_DEFUID;
> +   v9ses->dfltgid = V9FS_DEFGID;
>
...
> +#define V9FS_DEFUID(0)
> +#define V9FS_DEFGID(0)

I'm not sure if there is a good solution here, but I'm uncomfortable
with using uid=0 as the default.  I'm not sure if there is a default
uid for nobody, but anything is probably better than 0.  Looks like
nfsnobody is 65534, we could use that - even if only as a marker for
the server to map it to nobody on the target system?  What do you
think?

Particularly with attach-per-user, we probably need to look at
interacting with idmapd or create our own variant real soon.

  -eric

Re: [V9fs-developer] [PATCH] 9p: attach-per-user

2007-09-11 Thread Eric Van Hensbergen

On 9/3/07, Latchesar Ionkov <[EMAIL PROTECTED]> wrote:
>
> This patch improves the 9P2000 support by allowing every user to attach
> separately. The patch defines three modes of access (new mount option
> 'access'):
>

nit picks:
   * you added/changed options without updating Documentation/filesystems/9p.txt
   * you changed v9fs->extended to be part of a flags structure; that
     should have been a separate patch
   * rename of options should have been done in a separate patch

> - attach-per-user (access=user)
>  If a user tries to access a file served by v9fs for the first time, v9fs
>  sends an attach command to the server (Tattach) specifying the user. If
>  the attach succeeds, the user can access the v9fs tree.
>  As there is no uname->uid (string->integer) mapping yet, this mode works
>  only with the 9P2000.u dialect.
>
> - allow only one user to access the tree (access=<uid>)
>  Only the user with uid can access the v9fs tree. Other users that attempt
>  to access it will get EPERM error.
>
> - do all operations as a single user (access=any)
>  V9fs does a single attach and all operations are done as a single user.
>  If this mode is selected, the v9fs behavior is identical with the current
>  one.
>

Which option is default?
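
To make the three modes concrete, here is how I read the description,
as a plain C sketch.  enum access_mode, struct session, and
pick_attach_uid() are hypothetical names, and attaching as the permitted
uid in the single-user case is an assumption, not something the patch
description states.

```c
#include <errno.h>
#include <stdint.h>

enum access_mode {
	ACCESS_USER,	/* access=user: attach separately per user */
	ACCESS_SINGLE,	/* access=<uid>: only one uid may use the tree */
	ACCESS_ANY	/* access=any: one attach shared by everyone */
};

struct session {
	enum access_mode mode;
	uint32_t allowed_uid;	/* used by ACCESS_SINGLE */
	uint32_t attach_uid;	/* used by ACCESS_ANY */
};

/* Decide which uid to attach as for a caller, storing it in *out.
 * Returns 0 on success or -EPERM if the caller is not allowed in. */
int pick_attach_uid(const struct session *s, uint32_t caller, uint32_t *out)
{
	switch (s->mode) {
	case ACCESS_USER:
		*out = caller;
		return 0;
	case ACCESS_SINGLE:
		if (caller != s->allowed_uid)
			return -EPERM;
		*out = caller;	/* assumption: attach as the permitted uid */
		return 0;
	case ACCESS_ANY:
		*out = s->attach_uid;
		return 0;
	}
	return -EINVAL;
}
```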

>
>  /**
> - * v9fs_fid_insert - add a fid to a dentry
> + * v9fs_fid_add - add a fid to a dentry
> + * @dentry: dentry that the fid is being added to
>   * @fid: fid to add
> - * @dentry: dentry that it is being added to
>   *
>   */
>
> @@ -66,52 +67,144 @@ int v9fs_fid_add(struct dentry *dentry, struct p9_fid *fid)
>  }

Even small cleanups like this should probably be confined to a
separate patch if they are unrelated.

>
> -struct p9_fid *v9fs_fid_lookup(struct dentry *dentry)
> +static struct p9_fid *v9fs_fid_find(struct dentry *dentry, u32 uid, int any)
>  {
...
>
> -struct p9_fid *v9fs_fid_clone(struct dentry *dentry)
> +struct p9_fid *v9fs_fid_lookup(struct dentry *dentry)
>  {
...
> +
> +struct p9_fid *v9fs_fid_clone(struct dentry *dentry)
> +{

The way the patch got formatted, these look like compulsive
renames, but there's an added function and then changes to the other
two.  I think it might be because of the way you ordered the
functions.  Put new functions after the old functions and maybe this
won't happen.  Also, clone seems to have lost its function header.  The
code is pretty inconsistent about those these days, but I'd like to do
an audit soon and make sure we have proper comment blocks where
appropriate.

scripts/checkpatch.pl reports:

ERROR: need a space before the open parenthesis '('
#244: FILE: fs/9p/fid.c:147:
+   for(ds = dentry; !IS_ROOT(ds); ds = ds->d_parent)

ERROR: need a space before the open parenthesis '('
#275: FILE: fs/9p/fid.c:178:
+   for(d = dentry, i = n; i >= 0; i--, d = d->d_parent)

Please fix up these small bits and resubmit.

 -eric


Also, go ahead and cc: me directly on patches; for some reason this
one missed my normal filters and got lost.  If I'm directly cc:'d it
will pop to the top of my inbox.

   -eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [Lguest] [RFC] 9p Virtualization Transports

2007-09-03 Thread Eric Van Hensbergen

On 9/1/07, Rusty Russell <[EMAIL PROTECTED]> wrote:
> On Tue, 2007-08-28 at 13:52 -0500, Eric Van Hensbergen wrote:
> > The lguest and kvm transports are functional, but we are still working out
> > remaining bugs and need to spend some time focusing on performance issues.
> > I wanted to send out this "preview" patch set to the community to solicit
> > ideas on things we can do differently/better.
>
> Patches look reasonable, but just a heads-up: lguest will be moving to
> virtio, as will kvm.  That means a single implementation for both
> (yay!), but it does complicate your life in the short term 8(
>
> Dor has published a kvm virtio implementation, and we've already
> discussed a couple of modifications.  I expect that to be nailed in the
> next 2 weeks tho, and lguest will follow.
>

yeah, I've been emailing Dor -- it sounds like he'll have stuff ready
for the 2.6.24 merge window -- that being the case, I'll write a
virtio transport and mothball the PCI and lguest transports.  They
were straightforward to write (a couple hours for the lguest
transport) and the lguest transport was a good learning experience --
so I'm not shedding tears over wasted effort.

-eric

Re: [Lguest] [PATCH] modify lguest console to support multiple hvc's

2007-08-31 Thread Eric Van Hensbergen

On 8/30/07, Rusty Russell <[EMAIL PROTECTED]> wrote:
> On Thu, 2007-08-30 at 13:38 -0500, Eric Van Hensbergen wrote:
> > From: Eric Van Hensbergen <[EMAIL PROTECTED]>
> >
> > This was a quick modification I did of lguest to be able to support multiple
> > HVC channels for some experiments I was doing.  I'm not sure if this is more
> > generally useful, so I'm posting it to the list in case someone else has a
> > need for it.
> >
> > Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
>
> This is cool!  Great that it's useful for you.  What do the other
> consoles look like from inside the guest?
>

They just show up on other hvc device minor numbers.  I was running 9p
over them, but I wanted a tighter coupling for v9fs so I could tune
performance and incrementally optimize.

 -eric

[PATCH] modify lguest console to support multiple hvc's

2007-08-30 Thread Eric Van Hensbergen

From: Eric Van Hensbergen <[EMAIL PROTECTED]>

This was a quick modification I did of lguest to be able to support multiple
HVC channels for some experiments I was doing.  I'm not sure if this is more
generally useful, so I'm posting it to the list in case someone else has a
need for it.

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 Documentation/lguest/lguest.c |  161 -
 drivers/char/hvc_lguest.c |   57 +--
 2 files changed, 129 insertions(+), 89 deletions(-)

diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c
index f791840..c6a3e4d 100644
--- a/Documentation/lguest/lguest.c
+++ b/Documentation/lguest/lguest.c
@@ -690,12 +690,14 @@ static void restore_term(void)
 }
 
 /* We associate some data with the console for our exit hack. */
-struct console_abort
+struct console_priv
 {
+   /* which console we are */
+   int index;
/* How many times have they hit ^C? */
-   int count;
+   int a_count;
/* When did they start? */
-   struct timeval start;
+   struct timeval a_start;
 };
 
 /* This is the routine which handles console input (ie. stdin). */
@@ -705,11 +707,12 @@ static bool handle_console_input(int fd, struct device *dev)
int len;
unsigned int num;
struct iovec iov[LGUEST_MAX_DMA_SECTIONS];
-   struct console_abort *abort = dev->priv;
+   struct console_priv *cons = dev->priv;
 
/* First we get the console buffer from the Guest.  The key is dev->mem
-* which was set to 0 in setup_console(). */
-   lenp = get_dma_buffer(fd, dev->mem, iov, &num, &irq);
+* plus the console index adjusted to be a multiple of 4 because lguest
+* wants keys to be a multiple of 4 */
+   lenp = get_dma_buffer(fd, dev->mem+(cons->index*4), iov, &num, &irq);
if (!lenp) {
/* If it's not ready for input, warn and set up to discard. */
warn("console: no dma buffer!");
@@ -734,39 +737,44 @@ static bool handle_console_input(int fd, struct device *dev)
trigger_irq(fd, irq);
}
 
-   /* Three ^C within one second?  Exit.
-*
-* This is such a hack, but works surprisingly well.  Each ^C has to be
-* in a buffer by itself, so they can't be too fast.  But we check that
-* we get three within about a second, so they can't be too slow. */
-   if (len == 1 && ((char *)iov[0].iov_base)[0] == 3) {
-   if (!abort->count++)
-   gettimeofday(&abort->start, NULL);
-   else if (abort->count == 3) {
-   struct timeval now;
-   gettimeofday(&now, NULL);
-   if (now.tv_sec <= abort->start.tv_sec+1) {
-   u32 args[] = { LHREQ_BREAK, 0 };
-   /* Close the fd so Waker will know it has to
-* exit. */
-   close(waker_fd);
-   /* Just in case waker is blocked in BREAK, send
-* unbreak now. */
-   write(fd, args, sizeof(args));
-   exit(2);
+   /* Only do interrupt hack and restore_term() on initial console */
+   if (cons->index == 0) {
+   /* Three ^C within one second?  Exit.
+*
+* This is such a hack, but works surprisingly well.  Each ^C
+* has to be in a buffer by itself, so they can't be too fast.
+* But we check that we get three within about a second, so
+* they can't be too slow. */
+   if (len == 1 && ((char *)iov[0].iov_base)[0] == 3) {
+   if (!cons->a_count++)
+   gettimeofday(&cons->a_start, NULL);
+   else if (cons->a_count == 3) {
+   struct timeval now;
+   gettimeofday(&now, NULL);
+   if (now.tv_sec <= cons->a_start.tv_sec+1) {
+   u32 args[] = { LHREQ_BREAK, 0 };
+   /* Close the fd so Waker will know it
+* has to exit. */
+   close(waker_fd);
+   /* Just in case waker is blocked in
+* BREAK, send unbreak now. */
+   write(fd, args, sizeof(args));
+   exit(2);
+   }
+   cons->a_count = 0;
}
-

Re: [kvm-devel] [RFC] 9p: add KVM/QEMU pci transport

2007-08-28 Thread Eric Van Hensbergen

On 8/28/07, Arnd Bergmann <[EMAIL PROTECTED]> wrote:
> On Tuesday 28 August 2007, Eric Van Hensbergen wrote:
>
> > This adds a shared memory transport for a synthetic 9p device for
> > paravirtualized file system support under KVM/QEMU.
>
> Nice driver. I'm hoping we can do a virtio driver using a similar
> concept.
>

Yes.  I'm looking at the patches from Dor now; it should be pretty
straightforward.  The PCI transport is interesting in its own right for
other (non-virtual) projects we've been playing with.

 -eric

[RFC] 9p: add KVM/QEMU pci transport

2007-08-28 Thread Eric Van Hensbergen

From: Latchesar Ionkov <[EMAIL PROTECTED]>

This adds a shared memory transport for a synthetic 9p device for
paravirtualized file system support under KVM/QEMU.

Signed-off-by: Latchesar Ionkov <[EMAIL PROTECTED]>
Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 Documentation/filesystems/9p.txt |2 +
 net/9p/Kconfig   |   10 ++-
 net/9p/Makefile  |4 +
 net/9p/trans_pci.c   |  295 ++
 4 files changed, 310 insertions(+), 1 deletions(-)
 create mode 100644 net/9p/trans_pci.c

diff --git a/Documentation/filesystems/9p.txt b/Documentation/filesystems/9p.txt
index 1a5f50d..e1879bd 100644
--- a/Documentation/filesystems/9p.txt
+++ b/Documentation/filesystems/9p.txt
@@ -46,6 +46,8 @@ OPTIONS
tcp  - specifying a normal TCP/IP connection
fd   - used passed file descriptors for connection
 (see rfdno and wfdno)
+   pci  - use a PCI pseudo device for 9p communication
+   over shared memory between a guest and host
 
   uname=name   user name to attempt mount as on the remote server.  The
server may override or ignore this value.  Certain user
diff --git a/net/9p/Kconfig b/net/9p/Kconfig
index 09566ae..8517560 100644
--- a/net/9p/Kconfig
+++ b/net/9p/Kconfig
@@ -16,13 +16,21 @@ menuconfig NET_9P
 config NET_9P_FD
depends on NET_9P
default y if NET_9P
-   tristate "9P File Descriptor Transports (Experimental)"
+   tristate "9p File Descriptor Transports (Experimental)"
help
  This builds support for file descriptor transports for 9p
  which includes support for TCP/IP, named pipes, or passed
  file descriptors.  TCP/IP is the default transport for 9p,
  so if you are going to use 9p, you'll likely want this.  
 
+config NET_9P_PCI
+   depends on NET_9P
+   tristate "9p PCI Shared Memory Transport (Experimental)"
+   help
+ This builds support for a PCI pseudo-device currently available
+ under KVM/QEMU which allows for 9p transactions over shared
+ memory between the guest and the host.
+
 config NET_9P_DEBUG
bool "Debug information"
depends on NET_9P
diff --git a/net/9p/Makefile b/net/9p/Makefile
index 7b2a67a..26ce89d 100644
--- a/net/9p/Makefile
+++ b/net/9p/Makefile
@@ -1,5 +1,6 @@
 obj-$(CONFIG_NET_9P) := 9pnet.o
 obj-$(CONFIG_NET_9P_FD) += 9pnet_fd.o
+obj-$(CONFIG_NET_9P_PCI) += 9pnet_pci.o
 
 9pnet-objs := \
mod.o \
@@ -14,3 +15,6 @@ obj-$(CONFIG_NET_9P_FD) += 9pnet_fd.o
 
 9pnet_fd-objs := \
trans_fd.o \
+
+9pnet_pci-objs := \
+   trans_pci.o \
diff --git a/net/9p/trans_pci.c b/net/9p/trans_pci.c
new file mode 100644
index 000..36ddc5f
--- /dev/null
+++ b/net/9p/trans_pci.c
@@ -0,0 +1,295 @@
+/*
+ * net/9p/trans_pci.c
+ *
+ * 9P over PCI transport layer. For use with KVM/QEMU.
+ *
+ *  Copyright (C) 2007 by Latchesar Ionkov <[EMAIL PROTECTED]>
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License version 2
+ *  as published by the Free Software Foundation.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, write to:
+ *  Free Software Foundation
+ *  51 Franklin Street, Fifth Floor
+ *  Boston, MA  02111-1301  USA
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define P9PCI_DRIVER_NAME "9P PCI Device"
+#define P9PCI_DRIVER_VERSION "1"
+
+#define PCI_VENDOR_ID_9P   0x5002
+#define PCI_DEVICE_ID_9P   0x000D
+
+#define MAX_PCI_BUF (4*1024) /* TODO: Get a number from lucho */
+
+struct p9pci_trans {
+   struct pci_dev      *pdev;
+   void __iomem        *ioaddr;
+   void __iomem        *tx;
+   void __iomem        *rx;
+   int                 irq;
+   int                 pos;
+   int                 len;
+   wait_queue_head_t   wait;
+};
+static struct p9pci_trans *p9pci_trans; /* single channel for now */
+
+static struct pci_device_id p9pci_tbl[] = {
+   {PCI_VENDOR_ID_9P, PCI_DEVICE_ID_9P, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
+   {0,}
+};
+
+static irqreturn_t p9pci_interrupt(int irq, void *dev)
+{
+   p9pci_trans = dev;
+   p9pci_trans->len = le32_to_cpu(readl(p9pci_trans->rx));
+   P9_DPRINTK(P9_DEBUG_TRANS, "%p le

[REFERENCE ONLY] 9p: add shared memory transport

2007-08-28 Thread Eric Van Hensbergen

From: Eric Van Hensbergen <[EMAIL PROTECTED](none)>

This adds a 9p generic shared memory transport which has been used to
communicate between Dom0 and DomU under Xen as part of the Libra and PROSE
projects (http://www.research.ibm.com/prose).

Parts of the code are a horrible hack, but may be useful as reference
for constructing (or how not to construct) a poll-driven shared-memory driver
for Xen (or other purposes).

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 net/9p/Kconfig |7 +
 net/9p/Makefile|4 +
 net/9p/trans_shm.c |  378 
 3 files changed, 389 insertions(+), 0 deletions(-)
 create mode 100644 net/9p/trans_shm.c

diff --git a/net/9p/Kconfig b/net/9p/Kconfig
index fab7bb9..a1b55e8 100644
--- a/net/9p/Kconfig
+++ b/net/9p/Kconfig
@@ -38,6 +38,13 @@ config NET_9P_LG
  This builds support for a transport between an Lguest
  guest partition and the host partition.
 
+config NET_9P_SHM
+   depends on NET_9P
+   tristate "9p Shared Memory Transport (Experimental)"
+   help
+ This builds support for a shared memory transport which
+ can be used on XenPPC to mount 9p between DomU and Dom0.
+
 config NET_9P_DEBUG
bool "Debug information"
depends on NET_9P
diff --git a/net/9p/Makefile b/net/9p/Makefile
index 80a4227..e7a036a 100644
--- a/net/9p/Makefile
+++ b/net/9p/Makefile
@@ -2,6 +2,7 @@ obj-$(CONFIG_NET_9P) := 9pnet.o
 obj-$(CONFIG_NET_9P_FD) += 9pnet_fd.o
 obj-$(CONFIG_NET_9P_PCI) += 9pnet_pci.o
 obj-$(CONFIG_NET_9P_LG) += 9pnet_lg.o
+obj-$(CONFIG_NET_9P_SHM) += 9pnet_shm.o
 
 9pnet-objs := \
mod.o \
@@ -22,3 +23,6 @@ obj-$(CONFIG_NET_9P_LG) += 9pnet_lg.o
 
 9pnet_lg-objs := \
trans_lg.o \
+
+9pnet_shm-objs := \
+   trans_shm.o \
diff --git a/net/9p/trans_shm.c b/net/9p/trans_shm.c
new file mode 100644
index 000..d7847fd
--- /dev/null
+++ b/net/9p/trans_shm.c
@@ -0,0 +1,378 @@
+/*
+ * linux/fs/9p/trans_shm.c
+ *
+ *  Shared memory transport layer.
+ *
+ *  This is the Linux version of shared memory transport hack used
+ *  in the Libra and PROSE projects to communicate between Dom0 and
+ *  DomU under Xen and rHype.
+ *
+ *  Certain aspects of this code (such as the BIG_UGLY_BUFFER) are
+ *  horrible hacks, but the rest of the code may provide a decent starting
+ *  point for someone wanting to write a proper shared-memory transport for
+ *  Xen (or other purposes).
+ *
+ *  The server side of this transport exists in inferno-tx branch of
+ *  inferno.  It can be grabbed from the txinferno branch of
+ *  http://git.9grid.us/git/inferno.git
+ *
+ *  Copyright (C) 2006,2007 by Eric Van Hensbergen <[EMAIL PROTECTED]>
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License version 2
+ *  as published by the Free Software Foundation.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, write to:
+ *  Free Software Foundation
+ *  51 Franklin Street, Fifth Floor
+ *  Boston, MA  02111-1301  USA
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+enum
+{
+   Shm_Idle =        0,
+   Shm_Announcing =  1,
+   Shm_Announced =   2,
+   Shm_Connecting =  3,
+   Shm_Connected =   4,
+   Shm_Hungup =      5,
+
+   Shmaddrlen =      255,
+};
+
+enum
+{
+   S_USM = 1,  /* Sys V shared memory */
+   S_MSM = 2,  /* mmap */
+   S_XEN = 3,  /* xen shared memory */
+
+   SM_SERVER = 0,
+   SM_CLIENT = 1,
+
+   DATA_POLL = 100,
+   HANDSHAKE_POLL = 1
+};
+
+struct chan
+{
+   u32 magic;
+   u32 write;
+   u32 read;
+   u32 overflow;
+};
+
+enum {
+   Chan_listen,
+   Chan_connected,
+   Chan_hungup
+};
+
+/* Two circular buffers: small one for input, large one for output. */
+struct chan_pipe
+{
+   u32 magic;
+   u32 buflen;
+   u32 state;
+   struct chan out;
+   struct chan in;
+   char buffers[0];
+};
+
+#define CHUNK_SIZE (64<<20)
+#define CHAN_MAGIC 0xB0BABEEF
+#define CHAN_BUF_MAGIC 0xCAFEBABE
+
+/*
+ * UGLY HACK: static buffer just like in libOS so we can easily
+ * address things.  Xen hackers free to fix this.
+ *
+ */
+
+#define BIG_UGLY_BUFFER_SZ 8*1024
+static char big_ugly_buffer[sizeof(struct chan_pipe)+(BIG_UGLY_BUFFER_SZ*2)];
+
+/*
+ * (expr) may be as

[RFC] 9p: add lguest transport

2007-08-28 Thread Eric Van Hensbergen

From: Eric Van Hensbergen <[EMAIL PROTECTED](none)>

This adds a transport to 9p for communicating between guest and host
domains on lguest.  Currently, the host-side proxies the communication to a
socket connected to the actual server.  The transport is based heavily on
the existing console code.

A better integrated server component which eliminates some of the copy
overhead is in progress and will look less like the existing console code.

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 Documentation/filesystems/9p.txt |2 +
 Documentation/lguest/lguest.c|  127 
 fs/9p/v9fs.c |2 +-
 include/linux/lguest_launcher.h  |1 +
 net/9p/Kconfig   |7 +
 net/9p/Makefile  |4 +
 net/9p/trans_lg.c|  303 ++
 7 files changed, 445 insertions(+), 1 deletions(-)
 create mode 100644 net/9p/trans_lg.c

diff --git a/Documentation/filesystems/9p.txt b/Documentation/filesystems/9p.txt
index e1879bd..1a3342f 100644
--- a/Documentation/filesystems/9p.txt
+++ b/Documentation/filesystems/9p.txt
@@ -48,6 +48,8 @@ OPTIONS
 (see rfdno and wfdno)
pci  - use a PCI pseudo device for 9p communication
over shared memory between a guest and host
+   lg   - use a lguest 9p channel for communication
+   over shared memory between a guest and host
 
   uname=name   user name to attempt mount as on the remote server.  The
server may override or ignore this value.  Certain user
diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c
index f791840..adc50de 100644
--- a/Documentation/lguest/lguest.c
+++ b/Documentation/lguest/lguest.c
@@ -1318,6 +1318,128 @@ static void setup_tun_net(const char *arg, struct device_list *devices)
 }
 /* That's the end of device setup. */
 
+/* 9p transport code.
+ * This code implements the host side of the 9p transport.  Right now
+ * this is heavily based on the console code and just proxies data to
+ * a socket connected to an external server.  Eventually we'll hook the
+ * server code in more directly like we do with lguest to avoid the
+ * socket overhead.
+ */
+/* This is the routine that proxies 9p channel input */
+static bool handle_9p_input(int fd, struct device *dev)
+{
+   u32 irq = 0;
+   u32 *lenp;
+   int len = 0;
+   unsigned int num = 0;
+   struct iovec iov[LGUEST_MAX_DMA_SECTIONS];
+
+   /* First we get the console buffer from the Guest.  The key is dev->mem
+* which was set in setup_9p(). */
+
+   lenp = get_dma_buffer(fd, dev->mem, iov, &num, &irq);
+   if (!lenp) {
+   /* If it's not ready for input, warn and set up to discard. */
+   warn("9p: no dma buffer!");
+   discard_iovec(iov, &num);
+   }
+
+   /* This is why we convert to iovecs: the readv() call uses them, and so
+* it reads straight into the Guest's buffer. */
+   len = readv(dev->fd, iov, num);
+   if (len == 0) {
+   /*
+* BUG: When using msize > 1k we get zero length reads
+* and I'm not sure why.
+*/
+   err(1, "9p: zero length read!");
+   }
+
+   if (len < 0) /* Something has gone horribly wrong */
+   errx(1, "9p: input readv returned %d", len);
+
+   /* If we read the data into the Guest, fill in the length and send the
+* interrupt. */
+   if (lenp) {
+   *lenp = len;
+   trigger_irq(fd, irq);
+   }
+
+   /* Now, if we didn't read anything, return failure */
+   if (!len)
+   return false;
+
+   /* Everything went OK! */
+   return true;
+}
+
+/*  Proxy output to socket. */
+static u32 handle_9p_output(int fd, const struct iovec *iov,
+unsigned num, struct device*dev)
+{
+   /* Whatever the Guest sends, write it to the fd.  Return the
+* number of bytes written. */
+   return writev(dev->fd, iov, num);
+}
+
+/* Connect to 9p server (stolen from spfsclient by Lucho Ionkov) */
+/* We can't use gethostbyname because it makes us link a shared library */
+static int connect_9p(const char *arg)
+{
+   int fd, port;
+   char *addr, *p, *s;
+   struct sockaddr_in saddr;
+   u32 ipaddr;
+
+   if (!arg)
+   err(1, "9p: problem with args");
+
+   addr = strdup(arg);
+   ipaddr = str2ip(addr);
+
+   port = 567;
+   p = strrchr(addr, ':');
+   if (p) {
+   *p = '\0';
+   p++;
+   port = strtol(p, &s, 10);
+   if (*s != '\0')
+   err(1, "9p

[RFC] 9p: Make transports dynamic

2007-08-28 Thread Eric Van Hensbergen

From: Eric Van Hensbergen <[EMAIL PROTECTED](none)>

This patch abstracts out the interfaces to underlying transports so that
new transports can be added as modules.  This should also allow kernel
configuration of transports without ifdef-hell.

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 Documentation/filesystems/9p.txt |8 +-
 fs/9p/v9fs.c |  149 +++---
 fs/9p/v9fs.h |   15 +--
 fs/9p/vfs_super.c|   19 +--
 include/net/9p/client.h  |4 +-
 include/net/9p/conn.h|4 +-
 include/net/9p/transport.h   |   25 ++-
 net/9p/Kconfig   |   10 +
 net/9p/Makefile  |5 +-
 net/9p/client.c  |2 +-
 net/9p/mux.c |4 +-
 net/9p/trans_fd.c|  419 --
 12 files changed, 379 insertions(+), 285 deletions(-)

diff --git a/Documentation/filesystems/9p.txt b/Documentation/filesystems/9p.txt
index cda6905..1a5f50d 100644
--- a/Documentation/filesystems/9p.txt
+++ b/Documentation/filesystems/9p.txt
@@ -35,12 +35,12 @@ For remote file server:
 
 For Plan 9 From User Space applications (http://swtch.com/plan9)
 
-   mount -t 9p `namespace`/acme /mnt/9 -o proto=unix,uname=$USER
+   mount -t 9p `namespace`/acme /mnt/9 -o trans=unix,uname=$USER
 
 OPTIONS
 ===
 
-  proto=name   select an alternative transport.  Valid options are
+  trans=name   select an alternative transport.  Valid options are
currently:
unix - specifying a named pipe mount point
tcp  - specifying a normal TCP/IP connection
@@ -68,9 +68,9 @@ OPTIONS
0x40 = display transport debug
0x80 = display allocation debug
 
-  rfdno=n  the file descriptor for reading with proto=fd
+  rfdno=n  the file descriptor for reading with trans=fd
 
-  wfdno=n  the file descriptor for writing with proto=fd
+  wfdno=n  the file descriptor for writing with trans=fd
 
   maxdata=n    the number of bytes to use for 9p packet payload (msize)
 
diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c
index 0a7068e..08d880f 100644
--- a/fs/9p/v9fs.c
+++ b/fs/9p/v9fs.c
@@ -37,18 +37,58 @@
 #include "v9fs_vfs.h"
 
 /*
+ * Dynamic Transport Registration Routines
+ *
+ */
+
+static LIST_HEAD(v9fs_trans_list);
+static struct p9_trans_module *v9fs_default_trans;
+
+/**
+ * v9fs_register_trans - register a new transport with 9p
+ * @m: structure describing the transport module and entry points
+ *
+ */
+void v9fs_register_trans(struct p9_trans_module *m)
+{
+   list_add_tail(&m->list, &v9fs_trans_list);
+   if (m->def)
+   v9fs_default_trans = m;
+}
+EXPORT_SYMBOL(v9fs_register_trans);
+
+/**
+ * v9fs_match_trans - match transport versus registered transports
+ * @arg: string identifying transport
+ *
+ */
+static struct p9_trans_module *v9fs_match_trans(const substring_t *name)
+{
+   struct list_head *p;
+   struct p9_trans_module *t = NULL;
+
+   list_for_each(p, &v9fs_trans_list) {
+   t = list_entry(p, struct p9_trans_module, list);
+   if (strncmp(t->name, name->from, name->to-name->from) == 0) {
+   P9_DPRINTK(P9_DEBUG_TRANS, "trans=%s\n", t->name);
+   break;
+   }
+   }
+   return t;
+}
+
+/*
   * Option Parsing (code inspired by NFS code)
-  *
+  *  NOTE: each transport will parse its own options
   */
 
 enum {
/* Options that take integer arguments */
-   Opt_debug, Opt_port, Opt_msize, Opt_uid, Opt_gid, Opt_afid,
-   Opt_rfdno, Opt_wfdno,
+   Opt_debug, Opt_msize, Opt_uid, Opt_gid, Opt_afid,
/* String options */
-   Opt_uname, Opt_remotename,
+   Opt_uname, Opt_remotename, Opt_trans,
/* Options that take no arguments */
-   Opt_legacy, Opt_nodevmap, Opt_unix, Opt_tcp, Opt_fd, Opt_pci,
+   Opt_legacy, Opt_nodevmap,
/* Cache options */
Opt_cache_loose,
/* Error token */
@@ -57,24 +97,13 @@ enum {
 
 static match_table_t tokens = {
{Opt_debug, "debug=%x"},
-   {Opt_port, "port=%u"},
{Opt_msize, "msize=%u"},
{Opt_uid, "uid=%u"},
{Opt_gid, "gid=%u"},
{Opt_afid, "afid=%u"},
-   {Opt_rfdno, "rfdno=%u"},
-   {Opt_wfdno, "wfdno=%u"},
{Opt_uname, "uname=%s"},
{Opt_remotename, "aname=%s"},
-   {Opt_unix, "proto=unix"},
-   {Opt_tcp, "proto=tcp"},
-   {Opt_fd, "proto=fd"},
-#ifdef CONFIG_PCI_9P
-   {Opt_pci, "proto=pci"},
-#endif
-   {Opt_tcp, "tcp"},
-   {Opt_unix, "unix"},
-   {Opt_fd, "fd"},
+   {Opt_trans,
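
The registration and lookup scheme in the hunk above can be sketched in userspace C as follows. This is only an illustration under assumptions: `trans_module`, `register_trans` and `match_trans` are hypothetical stand-ins for the patch's `p9_trans_module`, `v9fs_register_trans` and `v9fs_match_trans`, and a singly linked list replaces the kernel's `list_head`. As far as can be told from the posted hunk, the loop there leaves `t` pointing at the last list entry when nothing matches and lets a prefix such as "tc" match "tcp"; the sketch returns NULL on a miss and compares the full name length instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct p9_trans_module. */
struct trans_module {
    const char *name;
    int def;                    /* nonzero: use as the default transport */
    struct trans_module *next;
};

static struct trans_module *trans_list;
static struct trans_module *default_trans;

/* Append a transport to the list and remember it if it is the default. */
static void register_trans(struct trans_module *m)
{
    struct trans_module **pp = &trans_list;

    assert(m != NULL && m->name != NULL);
    while (*pp)
        pp = &(*pp)->next;
    m->next = NULL;
    *pp = m;
    if (m->def)
        default_trans = m;
}

/*
 * Look up a transport by an option value that is not NUL-terminated
 * (like a substring_t): 'from' points at it, 'len' is its length.
 * Returns NULL when no registered transport has exactly that name.
 */
static struct trans_module *match_trans(const char *from, size_t len)
{
    struct trans_module *t;

    for (t = trans_list; t; t = t->next)
        if (strlen(t->name) == len && strncmp(t->name, from, len) == 0)
            return t;
    return NULL;
}
```

A caller that gets NULL back can then fall back to `default_trans` or reject the mount option, rather than silently using whichever transport happened to be registered last.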

[RFC] 9p Virtualization Transports

2007-08-28 Thread Eric Van Hensbergen

This patch set contains a set of virtualization transports for the 9p file
system intended to provide a mechanism for guests to access a portion of the
host's name space without having to go through a virtualized network.

Shared memory based transports are provided for lguest using a variation of 
the lguest console code and for KVM using a synthetic PCI device.  The patches
to the qemu portion of the latter will be posted to the kvm-devel list later
today.  

Also provided is a much older hack implementation which was used on XenPPC to
communicate between Dom0 and DomU as part of the PROSE
(http://www.research.ibm.com/prose) and Libra projects.  It is not our intent
to push the Xen shared memory transport into the kernel, but we are providing
it in this patch-set for historical reference.

The lguest and kvm transports are functional, but we are still working out
remaining bugs and need to spend some time focusing on performance issues.
I wanted to send out this "preview" patch set to the community to solicit
ideas on things we can do differently/better.

   -eric



[GIT PULL] 9p fixes

2007-08-23 Thread Eric Van Hensbergen

Linus, please pull from the 'for-linus' branch of:
 git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git/ for-linus

This tree contains the following:

Mariusz Kozlowski (1):
9p: fix bad error path in conversion routines

Adrian Bunk (2):
9p: remove deprecated v9fs_fid_lookup_remove
9p: fix use after free

Eric Van Hensbergen (1):
9p: update maintainers and documentation

 Documentation/filesystems/9p.txt |   24 +++-
 MAINTAINERS  |5 ++---
 fs/9p/fid.c  |   17 -
 fs/9p/fid.h  |2 --
 net/9p/conv.c|2 +-
 net/9p/mux.c |   10 ++
 6 files changed, 28 insertions(+), 32 deletions(-)

Re: [2.6.20.17 review 32/58] fs: 9p/conv.c error path fix

2007-08-22 Thread Eric Van Hensbergen

On 8/22/07, Willy Tarreau <[EMAIL PROTECTED]> wrote:
> When buf_check_overflow() returns != 0 we will hit kfree(ERR_PTR(err))
> and it will not be happy about it.
>
> Signed-off-by: Mariusz Kozlowski <[EMAIL PROTECTED]>
> Cc: Andrew Morton <[EMAIL PROTECTED]>
> Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>
> Signed-off-by: Willy Tarreau <[EMAIL PROTECTED]>
Acked-by: Eric Van Hensbergen <[EMAIL PROTECTED]>

This seems to be in the current code as well, I'll forward-port the patch...

-eric

Re: [2.6 patch] #if 0 v9fs_fid_lookup_remove()

2007-08-14 Thread Eric Van Hensbergen

Sorry, it's in my merge queue, but I've been fighting other fires.
I'll try to get this regression tested and merged into v9fs-devel
tomorrow afternoon along with a few other patches.

-eric


On 8/14/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:
> This patch #if 0's the unused v9fs_fid_lookup_remove().
>
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
>
> ---
>
> This patch has been sent on:
> - 29 Jul 2007
>
>  fs/9p/fid.c |2 ++
>  fs/9p/fid.h |1 -
>  2 files changed, 2 insertions(+), 1 deletion(-)
>
> --- linux-2.6.23-rc1-mm1/fs/9p/fid.h.old 2007-07-26 13:22:00.0 +0200
> +++ linux-2.6.23-rc1-mm1/fs/9p/fid.h 2007-07-26 13:22:07.0 +0200
> @@ -28,6 +28,5 @@
>  };
>
>  struct p9_fid *v9fs_fid_lookup(struct dentry *dentry);
> -struct p9_fid *v9fs_fid_lookup_remove(struct dentry *dentry);
>  struct p9_fid *v9fs_fid_clone(struct dentry *dentry);
>  int v9fs_fid_add(struct dentry *dentry, struct p9_fid *fid);
> --- linux-2.6.23-rc1-mm1/fs/9p/fid.c.old 2007-07-26 13:22:22.0 +0200
> +++ linux-2.6.23-rc1-mm1/fs/9p/fid.c 2007-07-26 13:22:40.0 +0200
> @@ -92,6 +92,7 @@
> return fid;
>  }
>
> +#if 0
>  struct p9_fid *v9fs_fid_lookup_remove(struct dentry *dentry)
>  {
> struct p9_fid *fid;
> @@ -107,6 +108,7 @@
>
> return fid;
>  }
> +#endif  /*  0  */
>
>
>  /**
>
>

Re: net/9p/mux.c: use-after-free

2007-07-26 Thread Eric Van Hensbergen


On 7/25/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Wed, 25 Jul 2007 13:43:16 -0500 "Eric Van Hensbergen" <[EMAIL PROTECTED]> wrote:
>
> > mtmp = ERR_PTR(PTR_ERR(m->tagpool));
>
> odd.  What does ERR_PTR(PTR_ERR(...)) do?
>

I kind of assumed it was a necessary evil to get the casting right.  A
quick grep shows it in 42 other places within the kernel.  Unpacking
the macros it looks like:

  (void *)(long)(struct p9_idpool *)

So all that you would really need is (void *) or ERR_PTR -- but that
might look confusing in the code.  Of course, broadening the context a
bit:

   m->tagpool = p9_idpool_create();
   if (!m->tagpool) {
   mtmp = ERR_PTR(PTR_ERR(m->tagpool));
   kfree(m);
   return mtmp;
   }

m->tagpool must be zero to enter the code at all, so we are returning
a NULL pointer, not really an error -- which is probably wrong (I
don't think it will properly trigger IS_ERR_VALUE) -- so we should
probably be returning -ENOMEM.

Of course, we really should be seeing an ERR_PTR returned from
p9_idpool_create, not 0 -- checking that code, it either returns
-ENOMEM or the correct value, never 0, so the check is wrong as well.
It should be:

   m->tagpool = p9_idpool_create();
   if (IS_ERR(m->tagpool)) {
   mtmp = ERR_PTR(-ENOMEM);
   kfree(m);
   return mtmp;
   }

We could have done:
  ERR_PTR(m->tagpool);
or kept the long:
  ERR_PTR(PTR_ERR(m->tagpool));
but I think returning an explicit error code keeps the code more clear.

So, which is the correct approach?
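
To make the macro discussion concrete, here is a userspace sketch of the error-pointer convention. The `ERR_PTR`, `PTR_ERR` and `IS_ERR` definitions are simplified stand-ins modelled on include/linux/err.h, not the kernel's exact code, and `idpool_create` and `conn_create` are hypothetical. It shows that NULL is not an error pointer, and that ERR_PTR(PTR_ERR(p)) only carries an errno from one pointer type to another.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Simplified userspace stand-ins for the helpers in include/linux/err.h. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
    /* Error pointers live in the last MAX_ERRNO addresses; NULL does not. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct idpool { int next; };
struct conn { struct idpool *tagpool; };

/* Hypothetical allocator: returns ERR_PTR(-ENOMEM) on failure, never NULL. */
static struct idpool *idpool_create(int fail)
{
    struct idpool *p = fail ? NULL : calloc(1, sizeof(*p));

    if (!p)
        return ERR_PTR(-ENOMEM);
    return p;
}

static struct conn *conn_create(int fail)
{
    struct conn *m = calloc(1, sizeof(*m));
    struct idpool *pool;

    if (!m)
        return ERR_PTR(-ENOMEM);

    pool = idpool_create(fail);
    if (IS_ERR(pool)) {             /* not "if (!pool)" */
        assert(PTR_ERR(pool) < 0);
        free(m);                    /* safe: the errno lives in 'pool' */
        /* Re-wrap the errno so it travels as a struct conn pointer. */
        return ERR_PTR(PTR_ERR(pool));
    }
    m->tagpool = pool;
    return m;
}
```

Passing the callee's error through, as in this sketch, and returning an explicit ERR_PTR(-ENOMEM) behave the same while the allocator can only fail with -ENOMEM; the explicit form just hard-codes that assumption.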

  -eric

Re: net/9p/mux.c: use-after-free

2007-07-25 Thread Eric Van Hensbergen


On 7/25/07, Latchesar Ionkov <[EMAIL PROTECTED]> wrote:

Yep, it's a leak.



Okay, I'll roll that into the patch as well.

  -eric

Re: net/9p/mux.c: use-after-free

2007-07-25 Thread Eric Van Hensbergen


On 7/22/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:

The Coverity checker spotted the following use-after-free
in net/9p/mux.c:

<--  snip  -->

...
struct p9_conn *p9_conn_create(struct p9_transport *trans, int msize,
unsigned char *extended)
{
...
if (!m->tagpool) {
kfree(m);
return ERR_PTR(PTR_ERR(m->tagpool));
}
...

<--  snip  -->



I've got a fix for this one:
   if (!m->tagpool) {
   mtmp = ERR_PTR(PTR_ERR(m->tagpool));
   kfree(m);
   return mtmp;
   }

but I was wondering about one of the other returns further down the function:

...
   memset(&m->poll_waddr, 0, sizeof(m->poll_waddr));
   m->poll_task = NULL;
   n = p9_mux_poll_start(m);
   if (n)
   return ERR_PTR(n);

   n = trans->poll(trans, &m->pt);
...

lucho: doesn't that constitute a leak?  Shouldn't we be doing:

   if (n) {
   kfree(m);
   return ERR_PTR(n);
   }
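
The shape of that fix, as a user-space sketch: every early return after
the allocation has to release it.  The names here (demo_conn,
demo_conn_create, demo_poll_start) are hypothetical stand-ins rather than
the real mux.c code, malloc/free stand in for kmalloc/kfree, and an
out-parameter replaces ERR_PTR to keep the sketch short.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct demo_conn {
	int msize;
	void *poll_task;
};

/* Counts live allocations so a leak is observable in this sketch. */
static int demo_live_conns;

/* Hypothetical stand-in for p9_mux_poll_start(): 0 on success, -errno on failure. */
static int demo_poll_start(struct demo_conn *m, int should_fail)
{
	(void)m;
	return should_fail ? -ENOMEM : 0;
}

static struct demo_conn *demo_conn_create(int msize, int should_fail, int *err)
{
	struct demo_conn *m;
	int n;

	*err = 0;
	m = calloc(1, sizeof(*m));
	if (!m) {
		*err = -ENOMEM;
		return NULL;
	}
	demo_live_conns++;
	m->msize = msize;

	n = demo_poll_start(m, should_fail);
	if (n) {
		/* Free before returning, otherwise m is leaked. */
		free(m);
		demo_live_conns--;
		*err = n;
		return NULL;
	}
	return m;
}

static void demo_conn_destroy(struct demo_conn *m)
{
	free(m);
	demo_live_conns--;
}
```

With the free in place, a failed demo_conn_create() leaves demo_live_conns
at zero; dropping it leaves the counter at one after the failure, which is
the leak in question.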

-eric

Re: CTL_UNNUMBERED (Re: [PATCH] 9p: Don't use binary sysctl numbers.)

2007-07-23 Thread Eric Van Hensbergen


On 7/23/07, Latchesar Ionkov <[EMAIL PROTECTED]> wrote:

It doesn't really matter (for me) whether it is sysctl or sysfs
interface. The sysctl approach seemed easier to implement. If the
consensus is to use sysfs, I'll send a patch (for 2.6.24).

Sorry for the incorrect implementation, I guess I stole the code from
an inappropriate place :)



I think this is appropriate for a "fix" submission to the 2.6.23-rc
series.  If you don't have the bandwidth right now, I'll look at
reworking it, or at the very least just removing the sysctl interface.

-eric

Re: CTL_UNNUMBERED (Re: [PATCH] 9p: Don't use binary sysctl numbers.)

2007-07-23 Thread Eric Van Hensbergen

On 7/21/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:

Alexey Dobriyan <[EMAIL PROTECTED]> writes:

>
> That's separate patch but CTL_UNNUMBERED must die, because it's totally
> unneeded. If you don't want sysctl(2) interface just SKIP ->ctl_name
> initialization and save one line for something useful.

As for the 9p code it doesn't seem to need or want a real binary
interface.  The 9p debug code picking of a semi-random number and not
patching it into sysctl.h like it should for a binary interface is
an implementation bug, and a maintenance problem.

Now that -rc1 is out, let's talk a bit more about this.  Lucho, can you
provide some level of justification of why you went for a sysctl
interface versus something directly accessible within the file system
-- that would seem more on-par with the 9p philosophy.

Perhaps it's time for a general cleanup of the debug_level stuff -- it
was always ugly to have it as a global, but there was just no clear
way to have the session structure available everywhere we use it.

  -eric

Re: Plan 9 Resource Sharing Support - what is it?

2007-07-23 Thread Eric Van Hensbergen


On 7/23/07, Pavel Machek <[EMAIL PROTECTED]> wrote:

Hi!

What is "plan 9 resource sharing"? Some kind of mosix-like process
migration? Could you explain it in two lines in Kconfig?

http://v9fs.sf.net

is redirect to

http://v9fs.sourceforge.net/

which tells me

Moved to SWiK

after clicking on that, I get to page with content, but no
explanation (could Kconfig/MAINTAINERS be updated?).



Sure, I'll try to put something in that is considerably less vague and
I'll update the URLs as well.  We've been somewhat lagging on
documentation.  The most complete explanation is available in the
Freenix paper:
  http://www.usenix.org/events/usenix05/tech/freenix/hensbergen.html

The short answer is that it is a Linux client for sharing file
systems, devices, and system services mapped as synthetic file systems
via a simple protocol.  Its primary focus has been to keep things
simple and to maintain support for effectively sharing synthetic file
systems, such as /proc, /sys, or ones exported by Plan 9 applications.
(Russ Cox's Plan 9 ports package contains a set of Plan 9 applications
ported to UNIX, such as the Venti content-addressable storage system.)

It's being used internally by (at the very least) IBM Research and Los
Alamos National Labs.  To date we've been focused on the client and a
few specialized servers; we are currently broadening our approach to a
better general-purpose server and evaluating whether it makes sense to
make an in-kernel server available.  We've also been looking at using 9p
in the context of virtualized environments to provide file service, and
perhaps sharing of other system resources as well.  Once we have a
better handle on the server story, we'll produce a much larger body of
documentation discussing usage as well as development of 9p file
servers.

There are also side-efforts underway evaluating different methods of
extending the 9p protocol to better support the Linux (and other UNIX)
environments -- ideally without adding a significant amount of
complexity.

-eric

[GIT PULL] 9p patch to fix compilation issue

2007-07-17 Thread Eric Van Hensbergen


Linus, please pull from the 'for-linus' branch of:
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git/ for-linus

This tree contains the following:

Dave Jones(1):
 fix debug compilation error

v9fs.c |3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

Re: [PATCH] 9p: fix debug compilation error

2007-07-16 Thread Eric Van Hensbergen


On 7/16/07, Dave Jones <[EMAIL PROTECTED]> wrote:

On Mon, Jul 16, 2007 at 09:47:49AM -0500, Eric Van Hensbergen wrote:
 > From: Meelis Roos <[EMAIL PROTECTED]>
 >
 > With 9P but no 9P debug options, this error occurs:
 >   CC [M]  fs/9p/v9fs.o
 > fs/9p/v9fs.c: In function 'v9fs_parse_options':
 > fs/9p/v9fs.c:134: error: 'p9_debug_level' undeclared (first use in this function)
 >
 > The following patch moves the definition of p9_debug_level out of #ifdef
 > and seems to fix the problem.
 >
 > (Original patch took care of the extern definition in the includes, but
 > not the actual definition in mod.c - ericvh)

Seems somewhat wasteful to include the debug options when the config
option has been disabled though.
Wouldn't something like this (untested) make more sense ?


Fair enough.  Looks like I introduced this when I put back the
mount-time debug option.



Dave

---

fs/9p/v9fs.c: In function 'v9fs_parse_options':
fs/9p/v9fs.c:134: error: 'p9_debug_level' undeclared (first use in this function)

Signed-off-by: Dave Jones <[EMAIL PROTECTED]>

Acked-by: Eric Van Hensbergen <[EMAIL PROTECTED]>



--- linux-2.6.22.noarch/fs/9p/v9fs.c~   2007-07-16 11:45:56.0 -0400
+++ linux-2.6.22.noarch/fs/9p/v9fs.c   2007-07-16 11:46:12.0 -0400
@@ -131,7 +131,9 @@ static void v9fs_parse_options(char *opt
 		switch (token) {
 		case Opt_debug:
 			v9ses->debug = option;
+#ifdef CONFIG_NET_9P_DEBUG
 			p9_debug_level = option;
+#endif
 			break;
 		case Opt_port:
 			v9ses->port = option;

--
http://www.codemonkey.org.uk



[PATCH] 9p: fix debug compilation error

2007-07-16 Thread Eric Van Hensbergen

From: Meelis Roos <[EMAIL PROTECTED]>

With 9P but no 9P debug options, this error occurs:
  CC [M]  fs/9p/v9fs.o
fs/9p/v9fs.c: In function 'v9fs_parse_options':
fs/9p/v9fs.c:134: error: 'p9_debug_level' undeclared (first use in this function)

The following patch moves the definition of p9_debug_level out of #ifdef
and seems to fix the problem.

(Original patch took care of the extern definition in the includes, but
not the actual definition in mod.c - ericvh)

Signed-off-by: Meelis Roos <[EMAIL PROTECTED]>
Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 include/net/9p/9p.h |4 ++--
 net/9p/mod.c|2 --
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/include/net/9p/9p.h b/include/net/9p/9p.h
index 4d3..8fc3796 100644
--- a/include/net/9p/9p.h
+++ b/include/net/9p/9p.h
@@ -27,6 +27,8 @@
 #ifndef NET_9P_H
 #define NET_9P_H
 
+extern unsigned int p9_debug_level;
+
 #ifdef CONFIG_NET_9P_DEBUG
 
 #define P9_DEBUG_ERROR (1<<0)
@@ -38,8 +40,6 @@
 #define P9_DEBUG_SLABS (1<<7)
 #define P9_DEBUG_FCALL (1<<8)
 
-extern unsigned int p9_debug_level;
-
 #define P9_DPRINTK(level, format, arg...) \
 do {  \
if ((p9_debug_level & level) == level) \
diff --git a/net/9p/mod.c b/net/9p/mod.c
index 4f9e1d2..951bb1d 100644
--- a/net/9p/mod.c
+++ b/net/9p/mod.c
@@ -28,12 +28,10 @@
 #include 
 #include 
 
-#ifdef CONFIG_NET_9P_DEBUG
 unsigned int p9_debug_level = 0;   /* feature-rific global debug level  */
 EXPORT_SYMBOL(p9_debug_level);
 module_param_named(debug, p9_debug_level, uint, 0);
 MODULE_PARM_DESC(debug, "9P debugging level");
-#endif
 
 extern int p9_mux_global_init(void);
 extern void p9_mux_global_exit(void);
-- 
1.5.0.2.gfbe3d-dirty


Re: [9Pfs] Compilation error

2007-07-16 Thread Eric Van Hensbergen


On 7/16/07, Alejandro Riveira Fernández <[EMAIL PROTECTED]> wrote:


 I get this in today's head 8f41958bdd577731f7411c9605cfaa9db6766809

$ make O=../2.6.23
  Using /home/alex/kernel/linux-2.6 as source for kernel
  GEN /home/alex/kernel/2.6.23/Makefile
  CHK include/linux/version.h
  CHK include/linux/utsrelease.h
  CALL /home/alex/kernel/linux-2.6/scripts/checksyscalls.sh
  CHK include/linux/compile.h
  CC [M]  fs/9p/v9fs.o
/home/alex/kernel/linux-2.6/fs/9p/v9fs.c: In function 'v9fs_parse_options':
/home/alex/kernel/linux-2.6/fs/9p/v9fs.c:134: error: 'p9_debug_level' undeclared (first use in this function)
/home/alex/kernel/linux-2.6/fs/9p/v9fs.c:134: error: (Each undeclared identifier is reported only once
/home/alex/kernel/linux-2.6/fs/9p/v9fs.c:134: error: for each function it appears in.)
make[3]: *** [fs/9p/v9fs.o] Error 1
make[2]: *** [fs/9p] Error 2
make[1]: *** [fs] Error 2
make: *** [_all] Error 2



Thanks, someone already submitted a patch, I'm merging and testing now.

-eric

[GIT PULL] 9p Patches for 2.6.23 merge window

2007-07-14 Thread Eric Van Hensbergen


Linus, please pull from the 'for-linus' branch of:
 git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git/ for-linus

This tree contains the following:

Latchesar Ionkov(3):
 Reorganization of 9p file system code
 Change net/9p module name to 9pnet
 Set error to EREMOTEIO if transport write returns zero

Eric Van Hensbergen(3):
 Cache meta-data when loose cache option is set
 Renable mount-time debug option
 Fix a race condition bug in umount which caused segfault

The bulk of the changes were in the reorganization patch which mostly
moved files and interfaces around in preparation for work on an
in-kernel 9p server.

b/fs/9p/Makefile |6
b/fs/9p/fid.c|  168 ++
b/fs/9p/fid.h|   43 -
b/fs/9p/v9fs.c   |  288 ++-
b/fs/9p/v9fs.h   |   32 -
b/fs/9p/v9fs_vfs.h   |6
b/fs/9p/vfs_addr.c   |   57 --
b/fs/9p/vfs_dentry.c |   37 -
b/fs/9p/vfs_dir.c|  155 +-
b/fs/9p/vfs_file.c   |  160 +-
b/fs/9p/vfs_inode.c  |  753 +++---
b/fs/9p/vfs_super.c  |   91 +--
b/fs/Kconfig |2
b/include/net/9p/9p.h|  417 +
b/include/net/9p/client.h|   80 +++
b/include/net/9p/conn.h  |   57 ++
b/include/net/9p/transport.h |   49 ++
b/net/9p/Kconfig |   21
b/net/9p/Makefile|   13
b/net/9p/client.c|  965 +++
b/net/9p/conv.c  |  903 
b/net/9p/error.c |  240 +
b/net/9p/fcprint.c   |  358 ++
b/net/9p/mod.c   |   85 +++
b/net/9p/mux.c   | 1050 +++
b/net/9p/sysctl.c|   86 +++
b/net/9p/trans_fd.c  |  363 ++
b/net/9p/util.c  |  125 +
b/net/Kconfig|1
b/net/Makefile   |2
fs/9p/9p.h   |  375 ---
fs/9p/conv.c |  845 --
fs/9p/conv.h |   50 --
fs/9p/debug.h|   77 ---
fs/9p/error.c|   93 ---
fs/9p/error.h|  177 ---
fs/9p/fcall.c|  427 -
fs/9p/fcprint.c  |  345 --
fs/9p/mux.c  | 1033 --
fs/9p/mux.h  |   55 --
fs/9p/trans_fd.c |  308 
fs/9p/transport.h|   45 -
fs/9p/v9fs.c |7
fs/9p/vfs_file.c |   14
fs/9p/vfs_inode.c|4
fs/9p/vfs_super.c|3
net/9p/Makefile  |7
net/9p/client.c  |7
net/9p/mux.c |7
49 files changed, 5400 insertions(+), 5092 deletions(-)

Re: [RFC][PATCH 2/14] Add a new mount flag (MNT_UNION) for union mount

2007-05-15 Thread Eric Van Hensbergen


On 5/15/07, Bharata B Rao <[EMAIL PROTECTED]> wrote:


So there can be two cases in union mounts:
1. A file exists in topmost layer and also in one or more lower layers. Deleting
the file would result in the top layer file being deleted and a whiteout being
created in the top layer.

2. A file exists in one or more of lower layers, but not in the topmost layer.
Deleting this file would result in just a whiteout being created in the
topmost layer.



I'd imagine there is a third potential option, which I'll admit strays
a bit from the conventional UNIX semantic.  If only one layer is
marked as writable, then any changes (including delete) only affect
that layer.  I could imagine this would be useful in situations like
overlaying a sandbox on an otherwise read-only source code tree (you
might want to just get rid of a modification by removing your file and
have it replaced by the original underlying source).

I suppose a further extension would be to have multiple layers marked
as mutable, where functions such as delete would affect all mutable
layers, but functions like create would only affect the top mutable
layer.

As an aside, perhaps it would be useful to mark the mutable layer at
mount time (instead of having it always be the top layer).  Again this
could lead to some optional non-conventional file system semantics,
but it has proven useful in Plan 9 union mount semantics and it seems a
fairly trivial extension to what you currently have.
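
A toy model of the semantics discussed above, as a sketch only.  Layers
are searched top-down, and a whiteout in a higher layer hides the name in
the layers below it.  Delete then either follows the conventional rule
(drop the top-layer copy and leave a whiteout if a lower layer still has
the file) or the alternative I describe, where only the writable layer is
touched and the underlying file shows through again.  All names
(demo_union, ENT_WHITEOUT, ...) are made up for illustration and have
nothing to do with the actual patch set; the structure tracks a single
file name to keep things short.

```c
#include <assert.h>

#define DEMO_MAX_LAYERS 4

enum demo_entry { ENT_ABSENT, ENT_FILE, ENT_WHITEOUT };

/* State of one file name in each layer; index 0 is the topmost layer. */
struct demo_union {
	int nlayers;
	enum demo_entry layer[DEMO_MAX_LAYERS];
};

/* Returns the index of the layer providing the file, or -1 if not visible. */
static int demo_lookup(const struct demo_union *u)
{
	int i;

	for (i = 0; i < u->nlayers; i++) {
		if (u->layer[i] == ENT_FILE)
			return i;
		if (u->layer[i] == ENT_WHITEOUT)
			return -1;	/* hidden by a whiteout above it */
	}
	return -1;
}

/* Conventional delete: the name must disappear from the union. */
static void demo_delete_whiteout(struct demo_union *u)
{
	int i, lower = 0;

	for (i = 1; i < u->nlayers; i++)
		if (u->layer[i] == ENT_FILE)
			lower = 1;
	/* Drop any top-layer copy; whiteout only if a lower copy exists. */
	u->layer[0] = lower ? ENT_WHITEOUT : ENT_ABSENT;
}

/* Alternative: delete only affects the single writable layer. */
static void demo_delete_writable_only(struct demo_union *u, int writable)
{
	if (u->layer[writable] == ENT_FILE)
		u->layer[writable] = ENT_ABSENT;
}
```

With a file present in both of two layers, demo_delete_whiteout() makes
the name vanish from the union, while demo_delete_writable_only() on the
top layer makes demo_lookup() fall through to the copy in layer 1.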

-eric

Re: [V9fs-developer] [PATCH 1/4] v9fs: rename non-vfs related structs and functions to be moved to net/9p

2007-05-08 Thread Eric Van Hensbergen

On 5/8/07, Andrew Morton <[EMAIL PROTECTED]> wrote:

On Tue, 8 May 2007 14:51:02 -0600
"Latchesar Ionkov" <[EMAIL PROTECTED]> wrote:

> This patchset moves non-filesystem interfaces of v9fs from fs/9p to net/9p.
> It moves the transport, packet marshalling and connection layers to net/9p
> leaving only the VFS related files in fs/9p.

(Please cc [EMAIL PROTECTED] on net-related work)

These changes would be best handled via Eric's git tree, with appropriate
acks from the net maintainers.

Ack.  I was waiting to see if Lucho hit any major brick walls with the
community before pulling them in.

   -eric

Re: [V9fs-developer] [PATCH] 9p: create separate 9p client interface

2007-04-30 Thread Eric Van Hensbergen

On 4/30/07, Christoph Hellwig <[EMAIL PROTECTED]> wrote:

On Mon, Apr 30, 2007 at 09:32:41AM -0600, Latchesar Ionkov wrote:
> Create a separate 9P client interface that can be used outside the VFS
> layer. In addition to VFS, the new interface can be used to export the
> authentication channel or from other interfaces.

And what exact users would that be?  We have a huge dislike for putting
abstractions in just for the abstractions sake, so if you want this
merged you'd better present a highly useful client to that interface.

I'll let Lucho give more details on his possible uses for such an
interface -- for my part we have been looking at doing in-kernel
servers for more efficient export of devices and system services (such
as the network stack).  We've been using user space applications for
doing such sharing, but there are undesirable inefficiencies in such
an approach.

19 files changed, 1656 insertions(+), 1585 deletions(-)

I believe the log message was poorly worded; it was more of a
reorganization of the existing interface than the creation of an
additional interface.

Also the non-filesystem interface code shouldn't live in fs/ but
rather in net/9p/

Which bits do you think are candidates for such a move?  The transport
interfaces? (there are a few more in the wings to cover shared memory
transports for VMMs among other things)  Should the protocol elements
move as well? -- that seems a bit fuzzier to me.

   -eric

Re: [patch 0/8] unprivileged mount syscall

2007-04-06 Thread Eric Van Hensbergen


On 4/6/07, H. Peter Anvin <[EMAIL PROTECTED]> wrote:

Jan Engelhardt wrote:
> On Apr 6 2007 16:16, H. Peter Anvin wrote:
 - users can use bind mounts without having to pre-configure them in
 /etc/fstab

>> This is by far the biggest concern I see.  I think the security implications
>> of allowing anyone to do bind mounts are poorly understood.
>
> $ whoami
> miklos
> $ mount --bind / ~/down_under
>
> later that day:
> # userdel -r miklos
>

Consider backups, for example.



This is the reason why enforcing private namespaces for user mounts
makes sense.  I think it catches many of these corner cases.

 -eric

[GIT PULL] 9p patches

2007-03-26 Thread Eric Van Hensbergen


Linus, please pull from the 'for-linus' branch of:
  git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git/ for-linus

This tree contains the following:

Adrian Bunk(1):
 make struct v9fs_cached_file_operations static

v9fs_vfs.h |1 -
vfs_file.c |4 +++-
2 files changed, 3 insertions(+), 2 deletions(-)

Re: [-mm patch] fs/9p/vfs_addr.c: make 2 functions static

2007-02-19 Thread Eric Van Hensbergen


On 2/19/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:

On Sat, Feb 17, 2007 at 09:51:46PM -0800, Andrew Morton wrote:
>...
> Changes since 2.6.20-mm1:
>...
>  git-v9fs.patch
>...
>  git trees
>...

This patch makes two needlessly global functions static.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>


Acked-by: Eric Van Hensbergen <[EMAIL PROTECTED]>

[GIT PULL] 9p patches

2007-02-18 Thread Eric Van Hensbergen


Linus, please pull from the 'for-linus' branch of:
   git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git/ for-linus

This tree contains the following:

Eric Van Hensbergen(1):
   Implement optional loose read cache

Eric W. Biederman(1):
   Use kthread_stop instead of sending a SIGKILL.

Documentation/filesystems/00-INDEX |4 ++--
Documentation/filesystems/9p.txt   |4 
fs/9p/fid.c|3 ++-
fs/9p/mux.c|5 +
fs/9p/v9fs.c   |9 -
fs/9p/v9fs.h   |9 -
fs/9p/v9fs_vfs.h   |2 ++
fs/9p/vfs_addr.c   |2 ++
fs/9p/vfs_dentry.c |   26 ++
fs/9p/vfs_file.c   |   18 ++
fs/9p/vfs_inode.c  |   20 
11 files changed, 89 insertions(+), 13 deletions(-)

Re: [PATCH] 9p: add write-cache support to loose cache mode (take 4)

2007-02-16 Thread Eric Van Hensbergen

On 2/16/07, Andrew Morton <[EMAIL PROTECTED]> wrote:

On Fri, 16 Feb 2007 18:46:59 -0600
Eric Van Hensbergen <[EMAIL PROTECTED]> wrote:
> + if(!PageUptodate(page)) {
> + if (to - from != PAGE_CACHE_SIZE) {
> + void *kaddr = kmap_atomic(page, KM_USER0);
> + memset(kaddr, 0, from);
> + memset(kaddr + to, 0, PAGE_CACHE_SIZE - to);
> + flush_dcache_page(page);
> + kunmap_atomic(kaddr, KM_USER0);
> + }
> + if ((file->f_flags & O_ACCMODE) != O_WRONLY)
> + v9fs_vfs_readpage_worker(file, page);
> + }

Seems strange to memset part of the page and to then go and fill the page
in from backing store.  Perhaps some optimisation is possible here?

Just double-checking in an effort to actually get the next patch right
(hopefully) -- seems like there are two cases -- if I can read from
the file, I just call readpage and it'll zero out bits.  If the file
is open write-only, things are a little cloudy -- fs/cifs looks like
they just don't do anything.  In the write-only case, do I need to
zero the unwritten portions of the page, or does this get handled
under the covers?  Looks like NFS just avoids this by only writing the
bits that change, which I suppose has other advantages.  I'll refactor
the writepage code to follow the NFS example versus the CIFS code I
originally based my implementation on.
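
For reference, the zeroing step under discussion boils down to the
following, shown here as a user-space sketch on a plain buffer.
DEMO_PAGE_SIZE and demo_zero_unwritten are illustrative names, not kernel
API; from and to delimit the byte range about to be written, as in the
quoted hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096

/*
 * Zero everything in the page outside [from, to), leaving the range that
 * is about to be written untouched.  Assumes from <= to <= DEMO_PAGE_SIZE.
 */
static void demo_zero_unwritten(unsigned char *page, size_t from, size_t to)
{
	if (to - from == DEMO_PAGE_SIZE)
		return;		/* whole page will be overwritten */
	memset(page, 0, from);
	memset(page + to, 0, DEMO_PAGE_SIZE - to);
}
```

If the page is then filled from the backing store anyway, both memsets are
wasted work, which is the optimisation being hinted at in the review.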

 -eric

[PATCH] 9p: add write-cache support to loose cache mode (take 4)

2007-02-16 Thread Eric Van Hensbergen

Loose cache mode was added primarily to assist exclusive, read-only
mounts (like venti) -- however, there is also a case for using loose
write caching in support of read/write exclusive mounts.  This feature
is linked to the loose cache option and is disabled by default.

This patch adds the necessary code to support writes through the page
cache.  Write caches are not used for synthetic files or for files opened
in APPEND mode.

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 fs/9p/9p.h|2 +-
 fs/9p/conv.c  |   18 +-
 fs/9p/conv.h  |2 +-
 fs/9p/fcall.c |   10 ++-
 fs/9p/fid.c   |   16 +++---
 fs/9p/fid.h   |2 +-
 fs/9p/v9fs_vfs.h  |2 +
 fs/9p/vfs_addr.c  |  173 +---
 fs/9p/vfs_dir.c   |2 +-
 fs/9p/vfs_file.c  |   61 ++
 fs/9p/vfs_inode.c |   20 +-
 fs/9p/vfs_super.c |3 +-
 12 files changed, 265 insertions(+), 46 deletions(-)

diff --git a/fs/9p/9p.h b/fs/9p/9p.h
index 94e2f92..6f8edf0 100644
--- a/fs/9p/9p.h
+++ b/fs/9p/9p.h
@@ -370,6 +370,6 @@ int v9fs_t_read(struct v9fs_session_info *v9ses, u32 fid,
u64 offset, u32 count, struct v9fs_fcall **rcall);
 
 int v9fs_t_write(struct v9fs_session_info *v9ses, u32 fid, u64 offset,
-u32 count, const char __user * data,
+u32 count, const char __user * data, char * kdata,
 struct v9fs_fcall **rcall);
 int v9fs_printfcall(char *, int, struct v9fs_fcall *, int);
diff --git a/fs/9p/conv.c b/fs/9p/conv.c
index a3ed571..9bb075a 100644
--- a/fs/9p/conv.c
+++ b/fs/9p/conv.c
@@ -458,6 +458,15 @@ v9fs_put_user_data(struct cbuf *bufp, const char __user * data, int count,
return copy_from_user(*pdata, data, count);
 }
 
+static int
+v9fs_put_kernel_data(struct cbuf *bufp, char * kdata, int count,
+  unsigned char **pdata)
+{
+   *pdata = buf_alloc(bufp, count);
+   memcpy(*pdata, kdata, count);
+   return 0;
+}
+
 static void
 v9fs_put_wstat(struct cbuf *bufp, struct v9fs_wstat *wstat,
   struct v9fs_stat *stat, int statsz, int extended)
@@ -723,7 +732,7 @@ struct v9fs_fcall *v9fs_create_tread(u32 fid, u64 offset, u32 count)
 }
 
 struct v9fs_fcall *v9fs_create_twrite(u32 fid, u64 offset, u32 count,
- const char __user * data)
+ const char __user * data, char * kdata)
 {
int size, err;
struct v9fs_fcall *fc;
@@ -738,7 +747,12 @@ struct v9fs_fcall *v9fs_create_twrite(u32 fid, u64 offset, u32 count,
v9fs_put_int32(bufp, fid, &fc->params.twrite.fid);
v9fs_put_int64(bufp, offset, &fc->params.twrite.offset);
v9fs_put_int32(bufp, count, &fc->params.twrite.count);
-   err = v9fs_put_user_data(bufp, data, count, &fc->params.twrite.data);
+   if(data)
+   err = v9fs_put_user_data(bufp, data, count,
+   &fc->params.twrite.data);
+   else
+   err = v9fs_put_kernel_data(bufp, kdata, count,
+   &fc->params.twrite.data);
if (err) {
kfree(fc);
fc = ERR_PTR(err);
diff --git a/fs/9p/conv.h b/fs/9p/conv.h
index dd5b6b1..8091672 100644
--- a/fs/9p/conv.h
+++ b/fs/9p/conv.h
@@ -42,7 +42,7 @@ struct v9fs_fcall *v9fs_create_tcreate(u32 fid, char *name, u32 perm, u8 mode,
char *extension, int extended);
 struct v9fs_fcall *v9fs_create_tread(u32 fid, u64 offset, u32 count);
 struct v9fs_fcall *v9fs_create_twrite(u32 fid, u64 offset, u32 count,
-   const char __user *data);
+   const char __user *data, char *kdata);
 struct v9fs_fcall *v9fs_create_tclunk(u32 fid);
 struct v9fs_fcall *v9fs_create_tremove(u32 fid);
 struct v9fs_fcall *v9fs_create_tstat(u32 fid);
diff --git a/fs/9p/fcall.c b/fs/9p/fcall.c
index dc336a6..ca77839 100644
--- a/fs/9p/fcall.c
+++ b/fs/9p/fcall.c
@@ -367,7 +367,7 @@ v9fs_t_read(struct v9fs_session_info *v9ses, u32 fid, u64 offset,
int ret;
struct v9fs_fcall *tc, *rc;
 
-   dprintk(DEBUG_9P, "fid %d offset 0x%llux count 0x%x\n", fid,
+   dprintk(DEBUG_9P, "fid %d offset 0x%llx count 0x%x\n", fid,
(long long unsigned) offset, count);
 
tc = v9fs_create_tread(fid, offset, count);
@@ -393,21 +393,23 @@ v9fs_t_read(struct v9fs_session_info *v9ses, u32 fid, u64 offset,
  * @fid: fid to write to
  * @offset: offset to start write at
  * @count: how many bytes to write
+ * @data: userspace data
+ * @kdata: kernelspace data
  * @fcall: pointer to response fcall
  *
  */
 
 int
 v9fs_t_write(struct v9fs_session_info *v9ses, u32 fid, u64 offset, u32 count,
-   const char __user *data, struct v9fs_fcall **rcp)
+   const char __user *data, char *kdata, struct v9fs_fcall **rcp)
 {
int ret;
struct v9fs_fcall *tc,

Re: [PATCH] 9p: add write-cache support to loose cache mode (take 3)

2007-02-16 Thread Eric Van Hensbergen


On 2/16/07, Andrew Morton <[EMAIL PROTECTED]> wrote:

On Fri, 16 Feb 2007 09:37:01 -0600 Eric Van Hensbergen <[EMAIL PROTECTED]> wrote:

> +static int v9fs_vfs_writepage(struct page *page, struct writeback_control *wbc)
> +{
> + char *buffer = NULL;
> + struct address_space *mapping = page->mapping;
> + int retval = -EIO;
> + loff_t offset = 0;
> + pgoff_t end_index;
> + int count = PAGE_CACHE_SIZE;
> + struct file *filp = v9fs_find_file(page);
> + struct inode *inode = mapping->host;
> +
> + dprintk(DEBUG_VFS, "page: %p\n", page);
> +
> + if ((!inode) || (!filp))
> + goto UnlockPage;
> +
> + end_index = inode->i_size >> PAGE_CACHE_SHIFT;
> +
> + /* complicated case at end of file */
> + if (page->index >= end_index) {
> + /* things got complicated... */
> + count = inode->i_size & (PAGE_CACHE_SIZE - 1);
> + if (page->index >= end_index + 1 || !count)
> + return 0;   /* truncated - don't care */
> + }
> +
> + buffer = kmap(page);
> + offset = ((loff_t) page->index << PAGE_CACHE_SHIFT);
> + page_cache_get(page);
> + retval = v9fs_write(filp, NULL, buffer, count, &offset);
> +
>   kunmap(page);
> +
> +  UnlockPage:
>   unlock_page(page);
> + page_cache_release(page);
> +
>   return retval;
>  }

The page_cache_get/release here aren't needed: lock_page suffices.

Are you sure the page refcounting is right if the `goto UnlockPage'
happens?

Can that goto actually happen??



It shouldn't; if it does, we are probably in big trouble anyway.  I'll
pull it out.  Thanks.
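
As an aside, the end-of-file length calculation in the quoted writepage
can be checked in isolation.  This is a user-space sketch with made-up
names (demo_writepage_count, DEMO_PAGE_SHIFT), assuming 4 KiB pages; it
returns how many bytes of the page at index should be written back for a
file of i_size bytes, or 0 when there is nothing in that page to write.

```c
#include <assert.h>

#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)

static unsigned long demo_writepage_count(unsigned long index,
					  unsigned long long i_size)
{
	unsigned long end_index = (unsigned long)(i_size >> DEMO_PAGE_SHIFT);
	unsigned long count = DEMO_PAGE_SIZE;

	if (index >= end_index) {
		/* Last (partial) page, or a page past end of file. */
		count = (unsigned long)(i_size & (DEMO_PAGE_SIZE - 1));
		if (index >= end_index + 1 || !count)
			return 0;	/* truncated - nothing to write */
	}
	return count;
}
```

For a 10000-byte file, pages 0 and 1 are written in full, page 2 carries
the 1808-byte tail, and anything from page 3 onwards is skipped.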

 -eric

[PATCH] 9p: add write-cache support to loose cache mode (take 3)

2007-02-16 Thread Eric Van Hensbergen

Loose cache mode was added primarily to assist exclusive, read-only
mounts (like venti) -- however, there is also a case for using loose
write caching in support of read/write exclusive mounts.  This feature
is linked to the loose cache option and is disabled by default.

This patch adds the necessary code to support writes through the page
cache.  Write caches are not used for synthetic files or for files opened
in APPEND mode.

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 fs/9p/9p.h|2 +-
 fs/9p/conv.c  |   18 +-
 fs/9p/conv.h  |2 +-
 fs/9p/fcall.c |   10 ++-
 fs/9p/fid.c   |   16 +++---
 fs/9p/fid.h   |2 +-
 fs/9p/v9fs_vfs.h  |2 +
 fs/9p/vfs_addr.c  |  183 ++---
 fs/9p/vfs_dir.c   |2 +-
 fs/9p/vfs_file.c  |   61 ++
 fs/9p/vfs_inode.c |   20 +-
 fs/9p/vfs_super.c |3 +-
 12 files changed, 275 insertions(+), 46 deletions(-)

diff --git a/fs/9p/9p.h b/fs/9p/9p.h
index 94e2f92..6f8edf0 100644
--- a/fs/9p/9p.h
+++ b/fs/9p/9p.h
@@ -370,6 +370,6 @@ int v9fs_t_read(struct v9fs_session_info *v9ses, u32 fid,
u64 offset, u32 count, struct v9fs_fcall **rcall);
 
 int v9fs_t_write(struct v9fs_session_info *v9ses, u32 fid, u64 offset,
-u32 count, const char __user * data,
+u32 count, const char __user * data, char * kdata,
 struct v9fs_fcall **rcall);
 int v9fs_printfcall(char *, int, struct v9fs_fcall *, int);
diff --git a/fs/9p/conv.c b/fs/9p/conv.c
index a3ed571..9bb075a 100644
--- a/fs/9p/conv.c
+++ b/fs/9p/conv.c
@@ -458,6 +458,15 @@ v9fs_put_user_data(struct cbuf *bufp, const char __user * data, int count,
return copy_from_user(*pdata, data, count);
 }
 
+static int
+v9fs_put_kernel_data(struct cbuf *bufp, char * kdata, int count,
+  unsigned char **pdata)
+{
+   *pdata = buf_alloc(bufp, count);
+   memcpy(*pdata, kdata, count);
+   return 0;
+}
+
 static void
 v9fs_put_wstat(struct cbuf *bufp, struct v9fs_wstat *wstat,
   struct v9fs_stat *stat, int statsz, int extended)
@@ -723,7 +732,7 @@ struct v9fs_fcall *v9fs_create_tread(u32 fid, u64 offset, u32 count)
 }
 
 struct v9fs_fcall *v9fs_create_twrite(u32 fid, u64 offset, u32 count,
- const char __user * data)
+ const char __user * data, char * kdata)
 {
int size, err;
struct v9fs_fcall *fc;
@@ -738,7 +747,12 @@ struct v9fs_fcall *v9fs_create_twrite(u32 fid, u64 offset, u32 count,
v9fs_put_int32(bufp, fid, &fc->params.twrite.fid);
v9fs_put_int64(bufp, offset, &fc->params.twrite.offset);
v9fs_put_int32(bufp, count, &fc->params.twrite.count);
-   err = v9fs_put_user_data(bufp, data, count, &fc->params.twrite.data);
+   if(data)
+   err = v9fs_put_user_data(bufp, data, count,
+   &fc->params.twrite.data);
+   else
+   err = v9fs_put_kernel_data(bufp, kdata, count,
+   &fc->params.twrite.data);
if (err) {
kfree(fc);
fc = ERR_PTR(err);
diff --git a/fs/9p/conv.h b/fs/9p/conv.h
index dd5b6b1..8091672 100644
--- a/fs/9p/conv.h
+++ b/fs/9p/conv.h
@@ -42,7 +42,7 @@ struct v9fs_fcall *v9fs_create_tcreate(u32 fid, char *name, u32 perm, u8 mode,
char *extension, int extended);
 struct v9fs_fcall *v9fs_create_tread(u32 fid, u64 offset, u32 count);
 struct v9fs_fcall *v9fs_create_twrite(u32 fid, u64 offset, u32 count,
-   const char __user *data);
+   const char __user *data, char *kdata);
 struct v9fs_fcall *v9fs_create_tclunk(u32 fid);
 struct v9fs_fcall *v9fs_create_tremove(u32 fid);
 struct v9fs_fcall *v9fs_create_tstat(u32 fid);
diff --git a/fs/9p/fcall.c b/fs/9p/fcall.c
index dc336a6..ca77839 100644
--- a/fs/9p/fcall.c
+++ b/fs/9p/fcall.c
@@ -367,7 +367,7 @@ v9fs_t_read(struct v9fs_session_info *v9ses, u32 fid, u64 offset,
int ret;
struct v9fs_fcall *tc, *rc;
 
-   dprintk(DEBUG_9P, "fid %d offset 0x%llux count 0x%x\n", fid,
+   dprintk(DEBUG_9P, "fid %d offset 0x%llx count 0x%x\n", fid,
(long long unsigned) offset, count);
 
tc = v9fs_create_tread(fid, offset, count);
@@ -393,21 +393,23 @@ v9fs_t_read(struct v9fs_session_info *v9ses, u32 fid, u64 offset,
  * @fid: fid to write to
  * @offset: offset to start write at
  * @count: how many bytes to write
+ * @data: userspace data
+ * @kdata: kernelspace data
  * @fcall: pointer to response fcall
  *
  */
 
 int
 v9fs_t_write(struct v9fs_session_info *v9ses, u32 fid, u64 offset, u32 count,
-   const char __user *data, struct v9fs_fcall **rcp)
+   const char __user *data, char *kdata, struct v9fs_fcall **rcp)
 {
int ret;
struct v9fs_fcall *tc,
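
The interface change that runs through this patch is that the write path
takes both a userspace pointer and a kernel pointer and copies from
whichever one is set.  A minimal userspace sketch of that selection follows.
It assumes a hypothetical copy_from_user_stub() standing in for the kernel's
copy_from_user() (which returns the number of bytes it could not copy), and
put_write_data() is an illustrative name, not a function from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for copy_from_user(): returns bytes NOT copied. */
static unsigned long copy_from_user_stub(void *dst, const void *src,
					 unsigned long n)
{
	memcpy(dst, src, n);
	return 0;
}

/*
 * Fill dst with count bytes for a write request.  One of udata (userspace
 * source) or kdata (kernel source) must be set; udata wins if both are,
 * mirroring the "if (data) ... else ..." branch in v9fs_create_twrite().
 * Returns 0 on success, -1 if the user copy came up short (an assumption
 * of this sketch; the patch passes the copy result through unchanged).
 */
static int put_write_data(unsigned char *dst, const char *udata,
			  const char *kdata, size_t count)
{
	assert(dst != NULL);
	assert(udata != NULL || kdata != NULL);

	if (udata)
		return copy_from_user_stub(dst, udata, count) ? -1 : 0;

	memcpy(dst, kdata, count);	/* kernel memory: a plain copy */
	return 0;
}
```

Callers on the page-cache path would pass NULL for the user pointer and the
mapped page address as the kernel pointer; callers on the direct write path
would do the reverse.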

[PATCH] 9p: add write-cache support to loose cache mode (take 2)

2007-02-15 Thread Eric Van Hensbergen

Loose cache mode was added primarily to assist exclusive, read-only
mounts (like venti) -- however, there is also a case for using loose
write cacheing in support of read/write exclusive mounts.  This feature
is linked to the loose cache option and is disabled by default.

This patch adds the code needed to support writes through the page
cache.  Write caches are not used for synthetic files or for files opened
in APPEND mode.

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 fs/9p/9p.h|2 +-
 fs/9p/conv.c  |   19 +-
 fs/9p/conv.h  |2 +-
 fs/9p/fcall.c |   10 ++-
 fs/9p/fid.c   |   17 +++--
 fs/9p/fid.h   |2 +-
 fs/9p/v9fs_vfs.h  |3 +
 fs/9p/vfs_addr.c  |  189 +---
 fs/9p/vfs_dir.c   |2 +-
 fs/9p/vfs_file.c  |   63 ++
 fs/9p/vfs_inode.c |   24 +--
 fs/9p/vfs_super.c |3 +-
 12 files changed, 287 insertions(+), 49 deletions(-)

diff --git a/fs/9p/9p.h b/fs/9p/9p.h
index 94e2f92..6f8edf0 100644
--- a/fs/9p/9p.h
+++ b/fs/9p/9p.h
@@ -370,6 +370,6 @@ int v9fs_t_read(struct v9fs_session_info *v9ses, u32 fid,
u64 offset, u32 count, struct v9fs_fcall **rcall);
 
 int v9fs_t_write(struct v9fs_session_info *v9ses, u32 fid, u64 offset,
-u32 count, const char __user * data,
+u32 count, const char __user * data, char * kdata,
 struct v9fs_fcall **rcall);
 int v9fs_printfcall(char *, int, struct v9fs_fcall *, int);
diff --git a/fs/9p/conv.c b/fs/9p/conv.c
index a3ed571..8d90c79 100644
--- a/fs/9p/conv.c
+++ b/fs/9p/conv.c
@@ -458,6 +458,15 @@ v9fs_put_user_data(struct cbuf *bufp, const char __user * data, int count,
return copy_from_user(*pdata, data, count);
 }
 
+static int
+v9fs_put_kernel_data(struct cbuf *bufp, char * kdata, int count,
+  unsigned char **pdata)
+{
+   *pdata = buf_alloc(bufp, count);
+   memcpy(*pdata, kdata, count);
+   return 0;
+}
+
 static void
 v9fs_put_wstat(struct cbuf *bufp, struct v9fs_wstat *wstat,
   struct v9fs_stat *stat, int statsz, int extended)
@@ -723,7 +732,7 @@ struct v9fs_fcall *v9fs_create_tread(u32 fid, u64 offset, u32 count)
 }
 
 struct v9fs_fcall *v9fs_create_twrite(u32 fid, u64 offset, u32 count,
- const char __user * data)
+ const char __user * data, char * kdata)
 {
int size, err;
struct v9fs_fcall *fc;
@@ -738,7 +747,13 @@ struct v9fs_fcall *v9fs_create_twrite(u32 fid, u64 offset, u32 count,
v9fs_put_int32(bufp, fid, &fc->params.twrite.fid);
v9fs_put_int64(bufp, offset, &fc->params.twrite.offset);
v9fs_put_int32(bufp, count, &fc->params.twrite.count);
-   err = v9fs_put_user_data(bufp, data, count, &fc->params.twrite.data);
+   if(data) {
+   err = v9fs_put_user_data(bufp, data, count,
+   &fc->params.twrite.data);
+   } else {
+   err = v9fs_put_kernel_data(bufp, kdata, count,
+   &fc->params.twrite.data);
+   }
if (err) {
kfree(fc);
fc = ERR_PTR(err);
diff --git a/fs/9p/conv.h b/fs/9p/conv.h
index dd5b6b1..8091672 100644
--- a/fs/9p/conv.h
+++ b/fs/9p/conv.h
@@ -42,7 +42,7 @@ struct v9fs_fcall *v9fs_create_tcreate(u32 fid, char *name, u32 perm, u8 mode,
char *extension, int extended);
 struct v9fs_fcall *v9fs_create_tread(u32 fid, u64 offset, u32 count);
 struct v9fs_fcall *v9fs_create_twrite(u32 fid, u64 offset, u32 count,
-   const char __user *data);
+   const char __user *data, char *kdata);
 struct v9fs_fcall *v9fs_create_tclunk(u32 fid);
 struct v9fs_fcall *v9fs_create_tremove(u32 fid);
 struct v9fs_fcall *v9fs_create_tstat(u32 fid);
diff --git a/fs/9p/fcall.c b/fs/9p/fcall.c
index dc336a6..ca77839 100644
--- a/fs/9p/fcall.c
+++ b/fs/9p/fcall.c
@@ -367,7 +367,7 @@ v9fs_t_read(struct v9fs_session_info *v9ses, u32 fid, u64 offset,
int ret;
struct v9fs_fcall *tc, *rc;
 
-   dprintk(DEBUG_9P, "fid %d offset 0x%llux count 0x%x\n", fid,
+   dprintk(DEBUG_9P, "fid %d offset 0x%llx count 0x%x\n", fid,
(long long unsigned) offset, count);
 
tc = v9fs_create_tread(fid, offset, count);
@@ -393,21 +393,23 @@ v9fs_t_read(struct v9fs_session_info *v9ses, u32 fid, u64 offset,
  * @fid: fid to write to
  * @offset: offset to start write at
  * @count: how many bytes to write
+ * @data: userspace data
+ * @kdata: kernelspace data
  * @fcall: pointer to response fcall
  *
  */
 
 int
 v9fs_t_write(struct v9fs_session_info *v9ses, u32 fid, u64 offset, u32 count,
-   const char __user *data, struct v9fs_fcall **rcp)
+   const char __user *data, char *kdata, struct v9fs_fcall **rcp)
 {
int ret;
struct v9f

Re: [RESEND][PATCH] 9p: add write-cache support to loose cache mode

2007-02-13 Thread Eric Van Hensbergen

On 2/13/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Tue, 13 Feb 2007 17:55:31 -0600 Eric Van Hensbergen <[EMAIL PROTECTED]> wrote:
> +int v9fs_prepare_write(struct file *file, struct page *page,
> +			unsigned from, unsigned to)
> +{
> +	if (!PageUptodate(page)) {
> +		if (to - from != PAGE_CACHE_SIZE) {
> +			void *kaddr = kmap_atomic(page, KM_USER0);
> +			memset(kaddr, 0, from);
> +			memset(kaddr + to, 0, PAGE_CACHE_SIZE - to);
> +			flush_dcache_page(page);
> +			kunmap_atomic(kaddr, KM_USER0);
> +		}
> +		SetPageUptodate(page);
> +	}
>
> This will mark the page uptodate while the piece between `to' and `from' is
> uninitialised.  A concurrent pagefault can come in and permit a read of
> that uninitialised data.  Because filemap_nopage() doesn't lock the page if
> it is uptodate.

Okay - I snagged this code from fs/libfs.c (simple_prepare_write) --
is that code also not correct, or am I just using it in the wrong
context?

   -eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
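
The concern raised in the quoted review can be reproduced outside the
kernel.  The following is a minimal userspace sketch, assuming a 4096-byte
page and a 0xAA fill as a stand-in for whatever stale data the page held
beforehand; zero_outside() and stale_in_window() are illustrative names.
After everything outside [from, to) has been zeroed, the window itself still
holds the old contents, which is what a reader could observe if the page
were marked up to date at that moment.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Zero the parts of the page outside [from, to), as the quoted code does. */
static void zero_outside(unsigned char *page, unsigned from, unsigned to)
{
	assert(page != NULL);
	assert(from <= to && to <= SKETCH_PAGE_SIZE);

	memset(page, 0, from);
	memset(page + to, 0, SKETCH_PAGE_SIZE - to);
}

/* Count bytes inside [from, to) that still carry the stale fill value. */
static unsigned stale_in_window(const unsigned char *page, unsigned from,
				unsigned to, unsigned char stale)
{
	unsigned i, n = 0;

	for (i = from; i < to; i++)
		if (page[i] == stale)
			n++;
	return n;
}
```

Filling a page with 0xAA, calling zero_outside(page, 100, 200) and then
stale_in_window(page, 100, 200, 0xAA) reports that all 100 bytes of the
window are still stale until the caller's data is actually copied in.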

[RESEND][PATCH] 9p: add write-cache support to loose cache mode

2007-02-13 Thread Eric Van Hensbergen

Loose cache mode was added primarily to assist exclusive, read-only
mounts (like venti) -- however, there is also a case for using loose
write cacheing in support of read/write exclusive mounts.  This feature
is linked to the loose cache option and is disabled by default.

This patch adds the code needed to support writes through the page
cache.  Write caches are not used for synthetic files or for files opened
in APPEND mode.

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 Documentation/filesystems/9p.txt |2 
 fs/9p/9p.h   |2 
 fs/9p/conv.c |   19 
 fs/9p/conv.h |2 
 fs/9p/fcall.c|6 +
 fs/9p/fid.c  |   17 ++--
 fs/9p/fid.h  |2 
 fs/9p/v9fs_vfs.h |2 
 fs/9p/vfs_addr.c |  180 --
 fs/9p/vfs_dir.c  |2 
 fs/9p/vfs_file.c |   58 ++--
 fs/9p/vfs_inode.c|   13 ++-
 fs/9p/vfs_super.c|3 -
 13 files changed, 264 insertions(+), 44 deletions(-)

diff --git a/Documentation/filesystems/9p.txt b/Documentation/filesystems/9p.txt
index bbd8b28..36ed211 100644
--- a/Documentation/filesystems/9p.txt
+++ b/Documentation/filesystems/9p.txt
@@ -42,7 +42,7 @@ OPTIONS
 
   cache=mode   specifies a cacheing policy.  By default, no caches are used.
loose = no attempts are made at consistency,
-intended for exclusive, read-only mounts
+intended for exclusive mounts
 
   debug=n  specifies debug level.  The debug level is a bitmask.
0x01 = display verbose error messages
diff --git a/fs/9p/9p.h b/fs/9p/9p.h
index 94e2f92..6f8edf0 100644
--- a/fs/9p/9p.h
+++ b/fs/9p/9p.h
@@ -370,6 +370,6 @@ int v9fs_t_read(struct v9fs_session_info
u64 offset, u32 count, struct v9fs_fcall **rcall);
 
 int v9fs_t_write(struct v9fs_session_info *v9ses, u32 fid, u64 offset,
-u32 count, const char __user * data,
+u32 count, const char __user * data, char * kdata,
 struct v9fs_fcall **rcall);
 int v9fs_printfcall(char *, int, struct v9fs_fcall *, int);
diff --git a/fs/9p/conv.c b/fs/9p/conv.c
index a3ed571..89c3d3c 100644
--- a/fs/9p/conv.c
+++ b/fs/9p/conv.c
@@ -458,6 +458,15 @@ v9fs_put_user_data(struct cbuf *bufp, co
return copy_from_user(*pdata, data, count);
 }
 
+static int
+v9fs_put_kernel_data(struct cbuf *bufp, char * kdata, int count,
+  unsigned char **pdata)
+{
+   *pdata = buf_alloc(bufp, count);
+   memcpy(*pdata, kdata, count);
+   return 0;
+}
+
 static void
 v9fs_put_wstat(struct cbuf *bufp, struct v9fs_wstat *wstat,
   struct v9fs_stat *stat, int statsz, int extended)
@@ -723,7 +732,7 @@ struct v9fs_fcall *v9fs_create_tread(u32
 }
 
 struct v9fs_fcall *v9fs_create_twrite(u32 fid, u64 offset, u32 count,
- const char __user * data)
+ const char __user * data, char * kdata)
 {
int size, err;
struct v9fs_fcall *fc;
@@ -738,7 +747,13 @@ struct v9fs_fcall *v9fs_create_twrite(u3
v9fs_put_int32(bufp, fid, &fc->params.twrite.fid);
v9fs_put_int64(bufp, offset, &fc->params.twrite.offset);
v9fs_put_int32(bufp, count, &fc->params.twrite.count);
-   err = v9fs_put_user_data(bufp, data, count, &fc->params.twrite.data);
+   if(data) {
+   err = v9fs_put_user_data(bufp, data, count, 
+   &fc->params.twrite.data); 
+   } else {
+   err = v9fs_put_kernel_data(bufp, kdata, count, 
+   &fc->params.twrite.data); 
+   }   
if (err) {
kfree(fc);
fc = ERR_PTR(err);
diff --git a/fs/9p/conv.h b/fs/9p/conv.h
index dd5b6b1..8091672 100644
--- a/fs/9p/conv.h
+++ b/fs/9p/conv.h
@@ -42,7 +42,7 @@ struct v9fs_fcall *v9fs_create_tcreate(u
char *extension, int extended);
 struct v9fs_fcall *v9fs_create_tread(u32 fid, u64 offset, u32 count);
 struct v9fs_fcall *v9fs_create_twrite(u32 fid, u64 offset, u32 count,
-   const char __user *data);
+   const char __user *data, char *kdata);
 struct v9fs_fcall *v9fs_create_tclunk(u32 fid);
 struct v9fs_fcall *v9fs_create_tremove(u32 fid);
 struct v9fs_fcall *v9fs_create_tstat(u32 fid);
diff --git a/fs/9p/fcall.c b/fs/9p/fcall.c
index dc336a6..671301a 100644
--- a/fs/9p/fcall.c
+++ b/fs/9p/fcall.c
@@ -393,13 +393,15 @@ v9fs_t_read(struct v9fs_session_info *v9
  * @fid: fid to write to
  * @offset: offset to start write at
  * @count: how many bytes to write
+ * @data: userspace data 
+ * @kdata: kernelspace data
  * @fcall: pointer to response fcall
  *
  */
 
 i

[PATCH] 9p: add write-cache support to loose cache mode

2007-02-13 Thread Eric Van Hensbergen

Loose cache mode was added primarily to assist exclusive, read-only
mounts (like venti) -- however, there is also a case for using loose
write cacheing in support of read/write exclusive mounts.  This feature
is linked to the loose cache option and is disabled by default.

This patch adds the code needed to support writes through the page
cache.  Write caches are not used for synthetic files or for files opened
in APPEND mode.

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 Documentation/filesystems/9p.txt |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Documentation/filesystems/9p.txt b/Documentation/filesystems/9p.txt
index bbd8b28..36ed211 100644
--- a/Documentation/filesystems/9p.txt
+++ b/Documentation/filesystems/9p.txt
@@ -42,7 +42,7 @@ OPTIONS
 
   cache=mode   specifies a cacheing policy.  By default, no caches are used.
loose = no attempts are made at consistency,
-intended for exclusive, read-only mounts
+intended for exclusive mounts
 
   debug=n  specifies debug level.  The debug level is a bitmask.
0x01 = display verbose error messages
-- 
1.4.1


[PATCH] 9p: implement optional loose read cache

2007-02-13 Thread Eric Van Hensbergen

While cacheing is generally frowned upon in the 9p world, it has its
place -- particularly in situations where the remote file system is
exclusive and/or read-only.  The vacfs views of venti content addressable
store are a real-world instance of such a situation.  To facilitate higher
performance for these workloads (and eventually use the fscache patches),
we have enabled a "loose" cache mode which does not attempt to maintain
any form of consistency on the page-cache or dcache.  This results in over
two orders of magnitude performance improvement for cacheable block reads
in the Bonnie benchmark.  The more aggressive use of the dcache also seems
to improve metadata operational performance.

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 Documentation/filesystems/00-INDEX |4 ++--
 Documentation/filesystems/9p.txt   |4 
 fs/9p/fid.c|3 ++-
 fs/9p/v9fs.c   |9 -
 fs/9p/v9fs.h   |9 -
 fs/9p/v9fs_vfs.h   |2 ++
 fs/9p/vfs_addr.c   |2 ++
 fs/9p/vfs_dentry.c |   26 ++
 fs/9p/vfs_file.c   |   18 ++
 fs/9p/vfs_inode.c  |   20 
 10 files changed, 88 insertions(+), 9 deletions(-)

diff --git a/Documentation/filesystems/00-INDEX b/Documentation/filesystems/00-INDEX
index 4dc28cc..5717858 100644
--- a/Documentation/filesystems/00-INDEX
+++ b/Documentation/filesystems/00-INDEX
@@ -4,6 +4,8 @@ Exporting
- explanation of how to make filesystems exportable.
 Locking
- info on locking rules as they pertain to Linux VFS.
+9p.txt
+   - 9p (v9fs) is an implementation of the Plan 9 remote fs protocol.
 adfs.txt
- info and mount options for the Acorn Advanced Disc Filing System.
 afs.txt
@@ -82,8 +84,6 @@ udf.txt
- info and mount options for the UDF filesystem.
 ufs.txt
- info on the ufs filesystem.
-v9fs.txt
-   - v9fs is a Unix implementation of the Plan 9 9p remote fs protocol.
 vfat.txt
- info on using the VFAT filesystem used in Windows NT and Windows 95
 vfs.txt
diff --git a/Documentation/filesystems/9p.txt b/Documentation/filesystems/9p.txt
index 4d075a4..bbd8b28 100644
--- a/Documentation/filesystems/9p.txt
+++ b/Documentation/filesystems/9p.txt
@@ -40,6 +40,10 @@ OPTIONS
   aname=name   aname specifies the file tree to access when the server is
offering several exported file systems.
 
+  cache=mode   specifies a cacheing policy.  By default, no caches are used.
+   loose = no attempts are made at consistency,
+intended for exclusive, read-only mounts
+
   debug=n  specifies debug level.  The debug level is a bitmask.
0x01 = display verbose error messages
0x02 = developer debug (DEBUG_CURRENT)
diff --git a/fs/9p/fid.c b/fs/9p/fid.c
index a9b6301..9041971 100644
--- a/fs/9p/fid.c
+++ b/fs/9p/fid.c
@@ -136,7 +136,8 @@ struct v9fs_fid *v9fs_fid_lookup(struct 
 }
 
 /**
- * v9fs_fid_clone - lookup the fid for a dentry, clone a private copy and release it
+ * v9fs_fid_clone - lookup the fid for a dentry, clone a private copy and
+ * release it
  * @dentry: dentry to look for fid in
  *
  * find a fid in the dentry and then clone to a new private fid
diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c
index d9b561b..6ad6f19 100644
--- a/fs/9p/v9fs.c
+++ b/fs/9p/v9fs.c
@@ -53,6 +53,8 @@ enum {
Opt_uname, Opt_remotename,
/* Options that take no arguments */
Opt_legacy, Opt_nodevmap, Opt_unix, Opt_tcp, Opt_fd,
+   /* Cache options */
+   Opt_cache_loose,
/* Error token */
Opt_err
 };
@@ -76,6 +78,8 @@ static match_table_t tokens = {
{Opt_fd, "fd"},
{Opt_legacy, "noextend"},
{Opt_nodevmap, "nodevmap"},
+   {Opt_cache_loose, "cache=loose"},
+   {Opt_cache_loose, "loose"},
{Opt_err, NULL}
 };
 
@@ -106,6 +110,7 @@ static void v9fs_parse_options(char *opt
v9ses->debug = 0;
v9ses->rfdno = ~0;
v9ses->wfdno = ~0;
+   v9ses->cache = 0;
 
if (!options)
return;
@@ -121,7 +126,6 @@ static void v9fs_parse_options(char *opt
"integer field, but no integer?\n");
continue;
}
-
}
switch (token) {
case Opt_port:
@@ -169,6 +173,9 @@ static void v9fs_parse_options(char *opt
case Opt_nodevmap:
v9ses->nodev = 1;
break;
+   case Opt_cache_loose:
+   v9ses->cache = CACHE_LOOSE;
+   break;
default:
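
The option handling in this patch is table driven: two spellings,
"cache=loose" and "loose", map to the same token, and that token sets the
session's cache mode.  A minimal userspace sketch of the idea follows, using
strcmp() in place of the kernel's match_token().  The SK_ names, the numeric
value chosen for the loose mode, and parse_cache_option() are assumptions
made for illustration and do not come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { SK_CACHE_NONE = 0, SK_CACHE_LOOSE = 1 };	/* values are assumed */
enum sk_token { SK_OPT_CACHE_LOOSE, SK_OPT_ERR };

struct sk_match {
	enum sk_token token;
	const char *pattern;
};

/* Both spellings select the same token, as in the patched match table. */
static const struct sk_match sk_tokens[] = {
	{ SK_OPT_CACHE_LOOSE, "cache=loose" },
	{ SK_OPT_CACHE_LOOSE, "loose" },
	{ SK_OPT_ERR, NULL }
};

/* Return the cache mode selected by one option word; default is no cache. */
static int parse_cache_option(const char *opt)
{
	const struct sk_match *m;

	assert(opt != NULL);

	for (m = sk_tokens; m->pattern != NULL; m++)
		if (m->token == SK_OPT_CACHE_LOOSE &&
		    strcmp(opt, m->pattern) == 0)
			return SK_CACHE_LOOSE;

	return SK_CACHE_NONE;
}
```

Any word that is not in the table leaves the default in place, which matches
the documented behaviour that no caches are used unless one is requested.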

Re: [RFC][PATCH] dm-cow: copy-on-write stackable target for device-mapper

2007-01-26 Thread Eric Van Hensbergen


On 11/27/06, Eric Van Hensbergen <[EMAIL PROTECTED]> wrote:

Subject: [RFC] [PATCH] dm-cow: copy-on-write stackable target for device-mapper

+
+A simple script file is included to format WB.
+The way it works is the standard dmsetup calls for
+the device mapper. The arguments are
+


There have been several requests for the "simple" script file.
Portions of its implementation are somewhat specific to our
installation (like the use of the rc shell).  I'm including it inline
here - not sure where else to put it -- perhaps in the Documentation:

#!/usr/bin/rc

tableboth ='0 devblksz cow devread 0 devwrite devbmapw 0'
devsize = `{sfdisk -s $1}

fn wipeoutdev{
   bmapsize = `{echo $devsize / 512 /2 |bc}
   echo dd 'if=/dev/zero' 'of='^$1 'count='^$bmapsize 'bs=512'
   dd 'if=/dev/zero' 'of='^$1 'count='^$bmapsize 'bs=512'
}

fn escapedev{
   echo $1|sed 's/\//\\\//g'
}


fn setcow{
   devread = `{escapedev $1}
   devwrite = `{escapedev $2}
   devbmapw = `{escapedev $3}
   tableset = `{echo $tableboth|sed -r 's/devread/'^$devread^'/'}
   tableset = `{echo $tableset|sed -r 's/devwrite/'^$devwrite^'/'}
   tableset = `{echo $tableset|sed -r 's/devbmapw/'^$devbmapw^'/'}
   devblksz = `{expr $devsize '*' 2}
   tableset = `{echo $tableset|sed -r 's/devblksz/'^$devblksz^'/'}
   echo trying to set $tableset
   echo $tableset > /tmp/tableset
   echo trying to dmsetup create $4
   dmsetup create $4 << EOF
$tableset
EOF
   dmsetup table
}

if(! ~ $#* 4){
   echo usage: newcow origdev destdev bmapdest namecow
   exit 1
}

namecow=$4
origdev=$1
wipeoutdev $2
setcow $1 $2 $3 $namecow
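
The table line that setcow assembles with sed has the shape
"0 <sectors> cow <read-dev> 0 <write-dev> <bitmap-dev> 0".  The sector count
is the sfdisk -s result doubled, which assumes sfdisk -s reports 1 KiB
blocks (the script's multiplication by 2 implies this).  A minimal C sketch
of the same string construction follows; build_cow_table() is an
illustrative helper and the device paths in the note below are placeholders.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the dm-cow table line used by the script:
 *   0 <sectors> cow <read-dev> 0 <write-dev> <bitmap-dev> 0
 * blocks_1k is the device size in 1 KiB blocks; it is doubled to get
 * 512-byte sectors.  Returns the line length, or -1 if buf is too small.
 */
static int build_cow_table(char *buf, size_t len, unsigned long long blocks_1k,
			   const char *devread, const char *devwrite,
			   const char *devbmap)
{
	int n;

	assert(buf != NULL && devread != NULL);
	assert(devwrite != NULL && devbmap != NULL);

	n = snprintf(buf, len, "0 %llu cow %s 0 %s %s 0",
		     blocks_1k * 2, devread, devwrite, devbmap);

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

For a 1024-block device and the placeholder paths /dev/a, /dev/b and /dev/c
this yields "0 2048 cow /dev/a 0 /dev/b /dev/c 0", the same text the script
would pipe into dmsetup create.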

[PATCH] 9p: null terminate error strings for debug print

2007-01-22 Thread Eric Van Hensbergen

From: Eric Van Hensbergen <[EMAIL PROTECTED]> - unquoted

We weren't properly NULL terminating protocol error strings for our
debug printk resulting in garbage being included in the output when debug
was enabled.

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 fs/9p/error.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/9p/error.c b/fs/9p/error.c
index ae91555..0d7fa4e 100644
--- a/fs/9p/error.c
+++ b/fs/9p/error.c
@@ -83,6 +83,7 @@ int v9fs_errstr2errno(char *errstr, int len)
 
if (errno == 0) {
/* TODO: if error isn't found, add it dynamically */
+   errstr[len] = 0;
printk(KERN_ERR "%s: errstr :%s: not found\n", __FUNCTION__,
   errstr);
errno = 1;
-- 
1.5.0.rc1.gde38
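
The fix writes a terminator at errstr[len], which relies on the buffer
having room for one more byte.  Where that cannot be assumed, a common
alternative (not what this patch does) is to bound the print with a
precision instead.  A minimal userspace sketch using snprintf() and "%.*s"
follows; format_errstr() is an illustrative name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a length-delimited error string that is NOT NUL-terminated.
 * The "%.*s" precision limits how many bytes of errstr are read, so no
 * terminator has to be written into the source buffer.
 * Returns the number of characters written, or -1 on truncation or error.
 */
static int format_errstr(char *out, size_t outlen, const char *errstr, int len)
{
	int n;

	assert(out != NULL && errstr != NULL && len >= 0);

	n = snprintf(out, outlen, "errstr :%.*s: not found", len, errstr);

	return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}
```

The trade-off is that every print site has to carry the length along, while
terminating the string once lets later code treat it as an ordinary C
string.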


[PATCH] 9p: fix segfault caused by race condition in meta-data operations

2007-01-22 Thread Eric Van Hensbergen

From: Eric Van Hensbergen <[EMAIL PROTECTED]> - unquoted

Running dbench multithreaded exposed a race condition where fid structures
were removed while in use.  This patch adds semaphores to meta-data operations
to protect the fid structure.  Some cleanup of error-case handling in the
inode operations is also included.

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 fs/9p/fid.c   |   69 +-
 fs/9p/fid.h   |5 ++
 fs/9p/vfs_file.c  |   47 ++--
 fs/9p/vfs_inode.c |  204 ++--
 4 files changed, 196 insertions(+), 129 deletions(-)

diff --git a/fs/9p/fid.c b/fs/9p/fid.c
index 2750720..a9b6301 100644
--- a/fs/9p/fid.c
+++ b/fs/9p/fid.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "debug.h"
 #include "v9fs.h"
@@ -84,6 +85,7 @@ struct v9fs_fid *v9fs_fid_create(struct v9fs_session_info *v9ses, int fid)
new->iounit = 0;
new->rdir_pos = 0;
new->rdir_fcall = NULL;
+   init_MUTEX(&new->lock);
INIT_LIST_HEAD(&new->list);
 
return new;
@@ -102,11 +104,11 @@ void v9fs_fid_destroy(struct v9fs_fid *fid)
 }
 
 /**
- * v9fs_fid_lookup - retrieve the right fid from a  particular dentry
+ * v9fs_fid_lookup - return a locked fid from a dentry
  * @dentry: dentry to look for fid in
- * @type: intent of lookup (operation or traversal)
  *
- * find a fid in the dentry
+ * find a fid in the dentry, obtain its semaphore and return a reference to it.
+ * code calling lookup is responsible for releasing lock
  *
  * TODO: only match fids that have the same uid as current user
  *
@@ -124,7 +126,68 @@ struct v9fs_fid *v9fs_fid_lookup(struct dentry *dentry)
 
if (!return_fid) {
dprintk(DEBUG_ERROR, "Couldn't find a fid in dentry\n");
+   return_fid = ERR_PTR(-EBADF);
}
 
+   if(down_interruptible(&return_fid->lock))
+   return ERR_PTR(-EINTR);
+
return return_fid;
 }
+
+/**
+ * v9fs_fid_clone - lookup the fid for a dentry, clone a private copy and release it
+ * @dentry: dentry to look for fid in
+ *
+ * find a fid in the dentry and then clone to a new private fid
+ *
+ * TODO: only match fids that have the same uid as current user
+ *
+ */
+
+struct v9fs_fid *v9fs_fid_clone(struct dentry *dentry)
+{
+   struct v9fs_session_info *v9ses = v9fs_inode2v9ses(dentry->d_inode);
+   struct v9fs_fid *base_fid, *new_fid = ERR_PTR(-EBADF);
+   struct v9fs_fcall *fcall = NULL;
+   int fid, err;
+
+   base_fid = v9fs_fid_lookup(dentry);
+
+   if(IS_ERR(base_fid))
+   return base_fid;
+
+   if(base_fid) {  /* clone fid */
+   fid = v9fs_get_idpool(&v9ses->fidpool);
+   if (fid < 0) {
+   eprintk(KERN_WARNING, "newfid fails!\n");
+   new_fid = ERR_PTR(-ENOSPC);
+   goto Release_Fid;
+   }
+
+   err = v9fs_t_walk(v9ses, base_fid->fid, fid, NULL, &fcall);
+   if (err < 0) {
+   dprintk(DEBUG_ERROR, "clone walk didn't work\n");
+   v9fs_put_idpool(fid, &v9ses->fidpool);
+   new_fid = ERR_PTR(err);
+   goto Free_Fcall;
+   }
+   new_fid = v9fs_fid_create(v9ses, fid);
+   if (new_fid == NULL) {
+   dprintk(DEBUG_ERROR, "out of memory\n");
+   new_fid = ERR_PTR(-ENOMEM);
+   }
+Free_Fcall:
+   kfree(fcall);
+   }
+
+Release_Fid:
+   up(&base_fid->lock);
+   return new_fid;
+}
+
+void v9fs_fid_clunk(struct v9fs_session_info *v9ses, struct v9fs_fid *fid)
+{
+   v9fs_t_clunk(v9ses, fid->fid);
+   v9fs_fid_destroy(fid);
+}
diff --git a/fs/9p/fid.h b/fs/9p/fid.h
index aa974d6..48fc170 100644
--- a/fs/9p/fid.h
+++ b/fs/9p/fid.h
@@ -30,6 +30,8 @@ struct v9fs_fid {
struct list_head list;   /* list of fids associated with a dentry */
struct list_head active; /* XXX - debug */
 
+   struct semaphore lock;
+
u32 fid;
unsigned char fidopen;/* set when fid is opened */
unsigned char fidclunked; /* set when fid has already been clunked */
@@ -55,3 +57,6 @@ struct v9fs_fid *v9fs_fid_get_created(struct dentry *);
 void v9fs_fid_destroy(struct v9fs_fid *fid);
 struct v9fs_fid *v9fs_fid_create(struct v9fs_session_info *, int fid);
 int v9fs_fid_insert(struct v9fs_fid *fid, struct dentry *dentry);
+struct v9fs_fid *v9fs_fid_clone(struct dentry *dentry);
+void v9fs_fid_clunk(struct v9fs_session_info *v9ses, struct v9fs_fid *fid);
+
diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
index e86a071..9f17b0c 100644
--- a/fs/9p/vfs_file.c
+++ b/fs/9p/vfs_file.c
@@ -55,53 +55,22 @@ int v9fs_file
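
The locking convention introduced by this patch is that a lookup hands back
the fid with its lock already held, and the caller is responsible for
releasing it.  A minimal userspace sketch of that convention follows, with a
pthread mutex standing in for the kernel semaphore; sk_fid, sk_fid_lookup()
and sk_fid_release() are illustrative names.  The sketch checks for a failed
lookup before it touches the lock, so an error result is never dereferenced.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct sk_fid {
	pthread_mutex_t lock;	/* stand-in for the per-fid semaphore */
	unsigned int id;
};

/*
 * Return the entry with its lock held, or NULL when there is nothing to
 * return.  The NULL check comes before the lock is taken.
 */
static struct sk_fid *sk_fid_lookup(struct sk_fid *candidate)
{
	if (candidate == NULL)
		return NULL;

	if (pthread_mutex_lock(&candidate->lock) != 0)
		return NULL;

	return candidate;
}

/* Counterpart to sk_fid_lookup(): the caller drops the lock when done. */
static void sk_fid_release(struct sk_fid *fid)
{
	assert(fid != NULL);
	pthread_mutex_unlock(&fid->lock);
}
```

Every successful sk_fid_lookup() has to be paired with one sk_fid_release()
on all exit paths, which is why the patch also reworks the error handling in
the inode operations.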

[PATCH] 9p: update documentation regarding server applications

2007-01-18 Thread Eric Van Hensbergen

Update the documentation to cover using Inferno as a server for 9p and to
include information about spfs (a stable single-threaded stand-alone 9p
server).

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 Documentation/filesystems/9p.txt |   20 +---
 1 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/Documentation/filesystems/9p.txt b/Documentation/filesystems/9p.txt
index 43b89c2..be45fe3 100644
--- a/Documentation/filesystems/9p.txt
+++ b/Documentation/filesystems/9p.txt
@@ -73,8 +73,22 @@ OPTIONS
 RESOURCES
 =
 
-The Linux version of the 9p server is now maintained under the npfs project
-on sourceforge (http://sourceforge.net/projects/npfs).
+Our current recommendation is to use Inferno (http://www.vitanuova.com/inferno)
+as the 9p server.  You can start a 9p server under Inferno by issuing the 
+following command:
+   ; styxlisten -A tcp!*!564 export '#U*'
+
+The -A specifies an unauthenticated export.  The 564 is the port # (you may
+have to choose a higher port number if running as a normal user).  The '#U*'
+specifies exporting the root of the Linux name space.  You may specify a 
+subset of the namespace by extending the path: '#U*'/tmp would just export
+/tmp.  For more information, see the Inferno manual pages covering styxlisten
+and export.
+
+A Linux version of the 9p server is now maintained under the npfs project
+on sourceforge (http://sourceforge.net/projects/npfs).  There is also a 
+more stable single-threaded version of the server (named spfs) available from 
+the same CVS repository.
 
 There are user and developer mailing lists available through the v9fs project
 on sourceforge (http://sourceforge.net/projects/v9fs).
@@ -96,5 +110,5 @@ STATUS
 
 The 2.6 kernel support is working on PPC and x86.
 
-PLEASE USE THE SOURCEFORGE BUG-TRACKER TO REPORT PROBLEMS.
+PLEASE USE THE KERNEL BUGZILLA TO REPORT PROBLEMS. (http://bugzilla.kernel.org)
 
-- 
1.5.0.rc1.gdf1b-dirty


[RESEND][PATCH] 9p: fix rename return code

2007-01-18 Thread Eric Van Hensbergen

9p doesn't handle renames between directories -- however, we were returning
EPERM instead of EXDEV when we detected this case.

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 fs/9p/vfs_inode.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 18f26cd..05d30e8 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -767,7 +767,7 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
/* 9P can only handle file rename in the same directory */
if (memcmp(&olddirfid->qid, &newdirfid->qid, sizeof(newdirfid->qid))) {
dprintk(DEBUG_ERROR, "old dir and new dir are different\n");
-   retval = -EPERM;
+   retval = -EXDEV;
goto FreeFcallnBail;
}
 
-- 
1.5.0.rc1.gdf1b-dirty


[RESEND][PATCH] 9p: fix bogus return code checks during initialization

2007-01-18 Thread Eric Van Hensbergen

There is a simple logic error in init_v9fs - the return code checks are
reversed.  This patch fixes the return code and adds some messages to
prevent module initialization from failing silently.

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 fs/9p/mux.c  |4 +++-
 fs/9p/v9fs.c |   11 ---
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/fs/9p/mux.c b/fs/9p/mux.c
index 944273c..147ceef 100644
--- a/fs/9p/mux.c
+++ b/fs/9p/mux.c
@@ -132,8 +132,10 @@ int v9fs_mux_global_init(void)
v9fs_mux_poll_tasks[i].task = NULL;
 
v9fs_mux_wq = create_workqueue("v9fs");
-   if (!v9fs_mux_wq)
+   if (!v9fs_mux_wq) {
+   printk(KERN_WARNING "v9fs: mux: creating workqueue failed\n");
return -ENOMEM;
+   }
 
return 0;
 }
diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c
index 0b96fae..d9b561b 100644
--- a/fs/9p/v9fs.c
+++ b/fs/9p/v9fs.c
@@ -457,14 +457,19 @@ static int __init init_v9fs(void)
 
v9fs_error_init();
 
-   printk(KERN_INFO "Installing v9fs 9P2000 file system support\n");
+   printk(KERN_INFO "Installing v9fs 9p2000 file system support\n");
 
ret = v9fs_mux_global_init();
-   if (!ret)
+   if (ret) {
+   printk(KERN_WARNING "v9fs: starting mux failed\n");
return ret;
+   }
ret = register_filesystem(&v9fs_fs_type);
-   if (!ret)
+   if (ret) {
+   printk(KERN_WARNING "v9fs: registering file system failed\n");
v9fs_mux_global_exit();
+   }
+
return ret;
 }
 
-- 
1.5.0.rc1.gdf1b-dirty
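
The corrected pattern is that each setup step returns 0 on success, so the
failure test is "if (ret)", and a failure in a later step undoes the earlier
one.  A minimal userspace sketch of that sequence follows.  sk_init() and
the three stub steps are illustrative names; the stubs exist only so the
sketch can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int  (*sk_init_fn)(void);
typedef void (*sk_exit_fn)(void);

/*
 * Run two setup steps in order.  Each returns 0 on success or a negative
 * errno.  If the second step fails, the first one is undone before the
 * error is returned.
 */
static int sk_init(sk_init_fn first, sk_exit_fn undo_first, sk_init_fn second)
{
	int ret;

	assert(first != NULL && undo_first != NULL && second != NULL);

	ret = first();
	if (ret)
		return ret;

	ret = second();
	if (ret)
		undo_first();

	return ret;
}

/* Stub steps for trying the sketch. */
static int sk_undo_calls;
static int  sk_step_ok(void)   { return 0; }
static int  sk_step_fail(void) { return -ENOMEM; }
static void sk_step_undo(void) { sk_undo_calls++; }
```

With the reversed test, a successful first step returned early and the file
system was never registered; with the test the right way round, the undo
step only runs when the second step fails.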


Re: [RFC][PATCH] dm-cache: block level disk cache target for device mapper

2006-12-11 Thread Eric Van Hensbergen


On 11/27/06, Eric Van Hensbergen <[EMAIL PROTECTED]> wrote:

This is the first cut of a device-mapper target which provides a write-back
or write-through block cache.  It is intended to be used in conjunction with
remote block devices such as iSCSI or ATA-over-Ethernet, particularly in
cluster situations.



The technical paper describing our motivations and some performance
results has finally made it through IBM's clearance process.  It is
available here:

http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/ba52bef8b940e7438525723c006bafea?OpenDocument

  -eric

Re: [RFC][PATCH] dm-cow: copy-on-write stackable target for device-mapper

2006-12-11 Thread Eric Van Hensbergen


On 11/27/06, Eric Van Hensbergen <[EMAIL PROTECTED]> wrote:

Subject: [RFC] [PATCH] dm-cow: copy-on-write stackable target for device-mapper

This is the first cut of a device-mapper target which allows stacking of
multiple block devices and in which the top-layer of the stack is a
copy-on-write layer.  It was originally developed in support of a cluster
image management solution.



The paper describing our motivation for this work including some
description of this implementation and performance results is now
available:

http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/801d563d3be022198525723c006fafc1?OpenDocument

  -eric

Re: [dm-devel] [RFC][PATCH] dm-cache: block level disk cache target for device mapper

2006-11-30 Thread Eric Van Hensbergen


On 11/30/06, Jens Wilke <[EMAIL PROTECTED]> wrote:

On Monday 27 November 2006 19:26, Eric Van Hensbergen wrote:

If this is intended to speed up remote disks, is it possible that the cache
content can be paged out on local disks in low-mem situations?



The main intent was to use local disks as cache to offload centralized
remote disks.  The logic was that most systems have local disks, if
only for swap -- so why not use them as a cache to help offload
centralized storage.  While the in-memory page cache works perfectly
fine in certain situations -- we were dealing with workloads in which
the in-memory page-cache wasn't sufficient to hold all the data.

There are also some additional possibilities we've thought through and
have been playing with including allowing the local disk cache to be
persistent across reboots (with varying validation schemes).

-eric

Re: [RFC][PATCH] dm-cache: block level disk cache target for device mapper

2006-11-27 Thread Eric Van Hensbergen

On 11/27/06, bert hubert <[EMAIL PROTECTED]> wrote:

On Mon, Nov 27, 2006 at 06:26:34PM +, Eric Van Hensbergen wrote:
> This is the first cut of a device-mapper target which provides a write-back
> or write-through block cache.  It is intended to be used in conjunction with
> remote block devices such as iSCSI or ATA-over-Ethernet, particularly in
> cluster situations.

How does this work in practice? In other words, what is a typical actual
configuration?

There is a remote block device, and a local one, and these are kept in
sync in some way?

That's the basic idea.  In our testbed, we had a single iSCSI server
exporting block devices to several clients -- each maintaining their
own local disk cache of the server exported block devices.  You can
configure either write-through or write-back policies -- write-back
has better performance, but somewhat obvious consistency issues in
failure cases.
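
As a rough illustration of the two policies, here is a minimal user-space sketch. It is not code from the dm-cache patch: the names toy_cache, toy_write and toy_flush are hypothetical, and a single 8-byte block stands in for the whole cache. It only shows where the remote write happens under each policy.

```c
#include <stdbool.h>
#include <string.h>

#define BLK 8	/* toy block size in bytes (assumption) */

enum policy { POLICY_WRITE_THROUGH, POLICY_WRITE_BACK };

struct toy_cache {
	enum policy policy;
	char local[BLK];	/* copy on the local cache disk */
	char remote[BLK];	/* copy on the remote (e.g. iSCSI) device */
	bool dirty;		/* local copy is newer than the remote copy */
	int remote_writes;	/* writes actually sent to the server */
};

/* Write one block: always update the cache, update the server per policy. */
static void toy_write(struct toy_cache *c, const char *data)
{
	memcpy(c->local, data, BLK);
	if (c->policy == POLICY_WRITE_THROUGH) {
		memcpy(c->remote, data, BLK);
		c->remote_writes++;
	} else {
		c->dirty = true;	/* deferred until toy_flush() */
	}
}

/* Write a dirty block back to the server (write-back policy only). */
static void toy_flush(struct toy_cache *c)
{
	if (!c->dirty)
		return;
	memcpy(c->remote, c->local, BLK);
	c->remote_writes++;
	c->dirty = false;
}
```

With write-back, two writes followed by one flush reach the server once; if the client dies before the flush, the server never sees the data, which is the consistency issue mentioned above.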

The original intent was to combine this with the dm-cow target (which
I posted a few hours before the dm-cache patch) to provide a scalable
cluster deployment system based on back-end iSCSI or ATA-over-Ethernet
storage.

 -eric

[RFC][PATCH] dm-cache: block level disk cache target for device mapper

2006-11-27 Thread Eric Van Hensbergen

This is the first cut of a device-mapper target which provides a write-back
or write-through block cache.  It is intended to be used in conjunction with
remote block devices such as iSCSI or ATA-over-Ethernet, particularly in
cluster situations.

In performance tests with iSCSI, it gave performance improvements of 2-10x over
iSCSI alone when Postmark or Bonnie loads were applied from 8 clients to
a single server.  Evidence suggests even greater differences on larger
clusters.  A detailed performance analysis will be available shortly via a
technical report on IBM's CyberDigest.

This module was developed during an internship at IBM Research by
Ming Zhao.  Please direct comments to both Ming and myself.

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 drivers/md/Kconfig|6 
 drivers/md/Makefile   |1 
 drivers/md/dm-cache.c | 1465 +
 3 files changed, 1472 insertions(+), 0 deletions(-)

diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig
index c92c152..0f23a15 100644
--- a/drivers/md/Kconfig
+++ b/drivers/md/Kconfig
@@ -261,6 +261,12 @@ config DM_MULTIPATH_EMC
---help---
  Multipath support for EMC CX/AX series hardware.
 
+config DM_CACHE
+   tristate "Cache target support (EXPERIMENTAL)"
+   depends on BLK_DEV_DM && EXPERIMENTAL
+   ---help---
+ Support for generic cache target for device-mapper.
+
 endmenu
 
 endif
diff --git a/drivers/md/Makefile b/drivers/md/Makefile
index 34957a6..49f7266 100644
--- a/drivers/md/Makefile
+++ b/drivers/md/Makefile
@@ -36,6 +36,7 @@ obj-$(CONFIG_DM_MULTIPATH_EMC)+= dm-emc
 obj-$(CONFIG_DM_SNAPSHOT)  += dm-snapshot.o
 obj-$(CONFIG_DM_MIRROR)+= dm-mirror.o
 obj-$(CONFIG_DM_ZERO)  += dm-zero.o
+obj-$(CONFIG_DM_CACHE) += dm-cache.o
 
 quiet_cmd_unroll = UNROLL  $@
   cmd_unroll = $(PERL) $(srctree)/$(src)/unroll.pl $(UNROLL) \
diff --git a/drivers/md/dm-cache.c b/drivers/md/dm-cache.c
new file mode 100755
index 000..209bae0
--- /dev/null
+++ b/drivers/md/dm-cache.c
@@ -0,0 +1,1465 @@
+/*
+ * dm-cache.c
+ * Device mapper target for block-level disk caching
+ *
+ * Copyright (C) International Business Machines Corp., 2006
+ * Author: Ming Zhao ([EMAIL PROTECTED])
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; under version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
+ */
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "dm.h"
+#include "dm-io.h"
+#include "dm-bio-list.h"
+#include "kcopyd.h"
+
+#define DMC_DEBUG  0
+
+#define DM_MSG_PREFIX  "dm-cache"
+#define DMC_PREFIX "dm-cache: "
+
+#if DMC_DEBUG
+#define DPRINTK( s, arg... ) printk(DMC_PREFIX s "\n", ##arg)
+#else
+#define DPRINTK( s, arg... )
+#endif
+
+#define WRITE_THROUGH  0
+#define WRITE_BACK 1
+#define DEFAULT_WRITE_POLICY   WRITE_BACK
+
+#define DMCACHE_COPY_PAGES 1024
+#define DEFAULT_CACHE_SIZE 65536
+#define DEFAULT_CACHE_ASSOC    1024
+#define DEFAULT_BLOCK_SIZE 8
+#define CONSECUTIVE_BLOCKS 128
+
+#define HASH   0
+#define UNIFORM    1
+#define DEFAULT_HASHFUNC   UNIFORM
+
+/* states of a cache block */
+#define INVALID    0
+#define VALID  1   /* Valid */
+#define RESERVED   2   /* Allocated but data not in place yet */
+#define DIRTY  4   /* Locally modified */
+#define WRITEBACK  8   /* In the process of write back */
+
+/*
+ * cache: maps a cache range of a device.
+ */
+struct cache_c {
+   struct dm_dev *src_dev; /* Source device */
+   struct dm_dev *cache_dev;   /* Cache device */
+   struct kcopyd_client *kcp_client;   /* Kcopyd client for writing back data */
+
+   struct cacheblock *cache;   /* Hash table for cache blocks */
+   sector_t size;  /* Cache size */
+   unsigned int bits;  /* Cache size in bits */
+   unsigned int assoc; /* Cache associativity */
+   unsigned int block_size;/* Cache block size */
+   unsigned int block_shift;   /* Cache block size in bits */
+   unsigned int block_mask;
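
The patch text is cut off above, so the actual lookup code is not shown. As a hedged user-space sketch of how fields such as size, assoc and block_shift could drive a set-associative lookup: the modulo set selection, the struct layout and all toy_* names are assumptions, not taken from the patch. Only the VALID bit value matches the defines above.

```c
#include <stddef.h>
#include <stdint.h>

#define VALID 1	/* same bit value as in the patch's state defines */

struct toy_block {
	uint64_t src_block;	/* block number on the source device */
	unsigned short state;	/* bitmask of state flags */
};

struct toy_cache {
	struct toy_block *blocks;	/* 'size' entries, grouped into sets */
	size_t size;			/* total number of cache blocks */
	size_t assoc;			/* blocks per set */
	unsigned int block_shift;	/* log2(block size in sectors) */
};

/* Return the index of the cache block holding 'sector', or -1 on a miss. */
static long toy_lookup(const struct toy_cache *c, uint64_t sector)
{
	uint64_t block = sector >> c->block_shift;
	size_t nsets = c->size / c->assoc;
	size_t base = (size_t)(block % nsets) * c->assoc;
	size_t i;

	/* Only the 'assoc' slots of one set are searched. */
	for (i = 0; i < c->assoc; i++) {
		const struct toy_block *b = &c->blocks[base + i];

		if ((b->state & VALID) && b->src_block == block)
			return (long)(base + i);
	}
	return -1;
}
```

Higher associativity means fewer conflict misses but a longer scan per lookup; the defaults above (65536 blocks, associativity 1024) would give 64 sets under this scheme.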

[RFC][PATCH] dm-cow: copy-on-write stackable target for device-mapper

2006-11-27 Thread Eric Van Hensbergen

Subject: [RFC] [PATCH] dm-cow: copy-on-write stackable target for device-mapper

This is the first cut of a device-mapper target which allows stacking of
multiple block devices and in which the top-layer of the stack is a 
copy-on-write layer.  It was originally developed in support of a cluster 
image management solution.

Existing device mapper snapshot facilities could be used to implement 
stackable block devices, as they support a copy-on-write mechanism 
for taking snapshots of logical volumes.  However, benchmarks (using
bonnie++) of such solutions showed an order of magnitude performance 
degradation.  This target was written in an attempt to provide a stacking
and copy-on-write solution which would incur only a minimal overhead.
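
The stacking scheme (base image B, read-only layers R, one writable layer W, each layer with a bitmap) can be sketched in user space as follows. This is an illustration under simplifying assumptions, not the dm-cow code: all toy_* names are hypothetical, blocks are single ints, and every write covers a whole block so no copy of the old contents is needed.

```c
#include <stdbool.h>
#include <stddef.h>

#define NBLOCKS 16	/* toy device size in blocks (assumption) */

struct toy_layer {
	bool present[NBLOCKS];	/* bitmap: block was written in this layer */
	int data[NBLOCKS];
};

struct toy_stack {
	int base[NBLOCKS];		/* base image B */
	struct toy_layer *layers;	/* layers[0] is the lowest R, the last is W */
	size_t nlayers;			/* must be at least 1 */
};

/* Read from the topmost layer that holds the block, else from the base. */
static int toy_read(const struct toy_stack *s, size_t blk)
{
	size_t i;

	for (i = s->nlayers; i-- > 0; )
		if (s->layers[i].present[blk])
			return s->layers[i].data[blk];
	return s->base[blk];
}

/* Writes only ever touch the top layer W; lower layers stay unchanged. */
static void toy_write(struct toy_stack *s, size_t blk, int value)
{
	struct toy_layer *w = &s->layers[s->nlayers - 1];

	w->data[blk] = value;
	w->present[blk] = true;
}
```

Adding a new W on top turns the previous W into an R simply because writes no longer reach it.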

Detailed performance results will be available shortly in a technical-report
available via IBM's CyberDigest website -- however, initial results obtained 
with bonnie++ show dramatic performance improvements.

The code within this module was developed by an intern (Gorka
Guardiola) during his stay with us at IBM Research.  Please direct comments
both to myself and Gorka.

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
---
 Documentation/device-mapper/dm-cow.txt |   29 +
 drivers/md/Kconfig |   15 +
 drivers/md/Makefile|1 
 drivers/md/dm-cow.c|  926 
 4 files changed, 971 insertions(+), 0 deletions(-)

diff --git a/Documentation/device-mapper/dm-cow.txt b/Documentation/device-mapper/dm-cow.txt
new file mode 100644
index 000..2b18ee6
--- /dev/null
+++ b/Documentation/device-mapper/dm-cow.txt
@@ -0,0 +1,29 @@
+This is a target for the dm-mapper which stacks
+block devices. The base image B is a formatted block
+device. Over that go N read only block devices R
+and then 1 write device W. It does copy on write
+of the devices, and reads from the appropriate device.
+You start by formatting B. Then add a W on it.
+W consists of two parts, a block device for the bitmap
+which should start zeroed and which gets some magic number
+on it the first time it is used. Then you can add another
+W and the first W turns into a R. WRB. and so on.
+
+A simple script file is included to format WB.
+The way it works is the standard dmsetup calls for
+the device mapper. The arguments are
+
+N M logdevname Bdevname Boffset Rbitmapdev Rdevname Roffset Wbitmapdev Wdevname Woffset
+
+N and M are the offsets on the logical device /dev/mapper/logdevname
+
+Bdevname is the base image device name
+Boffset is the offset on the base image device
+Rbitmapdev is the block device for the bitmap on a read only device
+
+and so on.
+
+a real world example of BW:
+
+0 8385866 cow /dev/sdb1 0 /dev/sdb2 /dev/sdb3 0
+
diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig
index c92c152..dc86099 100644
--- a/drivers/md/Kconfig
+++ b/drivers/md/Kconfig
@@ -261,6 +261,21 @@ config DM_MULTIPATH_EMC
---help---
  Multipath support for EMC CX/AX series hardware.
 
+config DM_COW
+   tristate "Copy-on-write Stackable target (EXPERIMENTAL)"
+   depends on BLK_DEV_DM && EXPERIMENTAL
+   ---help---
+  This is a target for the dm-mapper which stacks
+  block devices. The base image B is a formatted block
+  device. Over that go N read only block devices R
+  and then 1 write device W. It does copy on write
+  of the devices, and reads from the appropriate device.
+  You start by formatting B. Then add a W on it.
+  W consists of two parts, a block device for the bitmap
+  which should start zeroed and which gets some magic number
+  on it the first time it is used. Then you can add another
+  W and the first W turns into a R. WRB. and so on.
+
 endmenu
 
 endif
diff --git a/drivers/md/Makefile b/drivers/md/Makefile
index 34957a6..8a3d79f 100644
--- a/drivers/md/Makefile
+++ b/drivers/md/Makefile
@@ -36,6 +36,7 @@ obj-$(CONFIG_DM_MULTIPATH_EMC)+= dm-emc
 obj-$(CONFIG_DM_SNAPSHOT)  += dm-snapshot.o
 obj-$(CONFIG_DM_MIRROR)+= dm-mirror.o
 obj-$(CONFIG_DM_ZERO)  += dm-zero.o
+obj-$(CONFIG_DM_COW)   += dm-cow.o
 
 quiet_cmd_unroll = UNROLL  $@
   cmd_unroll = $(PERL) $(srctree)/$(src)/unroll.pl $(UNROLL) \
diff --git a/drivers/md/dm-cow.c b/drivers/md/dm-cow.c
new file mode 100644
index 000..e77060e
--- /dev/null
+++ b/drivers/md/dm-cow.c
@@ -0,0 +1,926 @@
+/*
+ * dm-cow.c
+ * Device mapper target for block-level disk caching
+ * 
+ * Copyright (C) International Business Machines Corp., 2006
+ * Author: Gorka Guardiola and Eric Van Hensbergen ([EMAIL PROTECTED])
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; under version 2 of the License.
+ *
+ * This program is distributed in the hope that it will b

Re: FUSE merging?

2005-09-03 Thread Eric Van Hensbergen

On 9/3/05, Miklos Szeredi <[EMAIL PROTECTED]> wrote:
> > While FUSE doesn't handle it directly, doesn't it have to punt it to
> > its network file systems?  How do sshfs and whatnot handle this
> > sort of mapping?
> 
> Sshfs handles it by not handling it.  In this case it is neither
> possible, nor needed to be able to correctly map the id space.
> 
> Yes, it may confuse the user.  It may even confuse the kernel for
> sticky directories(*).  But basically it just works, and is very
> simple.
> 

In principle, Plan 9 file servers handle permission checking
server-side, so we could likewise punt -- but it seemed a good idea to
have some form of mapping for directory listings (and things like
sticky directories) to make sense.

   -eric

Re: FUSE merging?

2005-09-03 Thread Eric Van Hensbergen

On 9/3/05, Miklos Szeredi <[EMAIL PROTECTED]> wrote:
> 
> > I agree that lots of people would like the functionality.  I regret that
> > although it appears that v9fs could provide it,
> 
> I think you are wrong there.  You don't appreciate all the complexity
> FUSE _lacks_ by not being network transparent.  Just look at the error
> text to errno conversion muck that v9fs has.  And their problems with
> trying to do generic uid/gid mappings.
>

While FUSE doesn't handle it directly, doesn't it have to punt it to
its network file systems?  How do sshfs and whatnot handle this
sort of mapping?  Not really a criticism, just curious.  This doesn't
so much relate to FUSE, but I've been wrestling with what to do about
this chunk of (mapping) code -- it seems like it might be a good idea
to have some common code shared amongst the networked file systems to
handle this sort of thing.  The NFS idmapd service seems
overcomplicated, but something like that in the common code could
provide the same level of service.  What do folks think? Should
someone (me?) take a whack at a common id mapping service for the
kernel (or just extract idmapd from NFS) -- or is this something
better implemented filesystem-to-filesystem?
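
To make the question concrete, here is a minimal sketch of what the core of such a shared mapping helper might look like. It is purely hypothetical: none of the toy_* names exist in the kernel, and a real service would need locking, dynamic updates and a user-space policy source. It assumes the remote side identifies owners by name (as 9P does in its stat structure) and the local side by numeric id.

```c
#include <stddef.h>
#include <string.h>

struct toy_idmap_entry {
	const char *remote_name;	/* owner name reported by the server */
	unsigned int local_id;		/* uid or gid to present locally */
};

struct toy_idmap {
	const struct toy_idmap_entry *entries;
	size_t count;
	unsigned int fallback_id;	/* used for unknown names, e.g. "nobody" */
};

/* Translate a remote owner name to a local id, falling back if unmapped. */
static unsigned int toy_map_name(const struct toy_idmap *m, const char *name)
{
	size_t i;

	for (i = 0; i < m->count; i++)
		if (strcmp(m->entries[i].remote_name, name) == 0)
			return m->entries[i].local_id;
	return m->fallback_id;
}
```

The fallback keeps directory listings sensible even when the mapping is incomplete, while permission checks can still be left to the server.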

> > there seems to be no interest in working on that.
> 
> It would mean adding a plethora of extensions to the 9P protocol, that
> would take away all its beauty.  I think you should realize that
> these are different interfaces for different purposes. There may be
> some overlap, but not enough to warrant trying to massage them into
> one big ball.
> 

A very good point.  I toyed with the idea of looking at creating a
FUSE-API-compatible v9fs file server library - but there are a good
deal of features (like extended attributes) that we don't have
provisions for in the protocol -- and most likely a good deal of
complexity supporting some of these features that we may not want to
deal with just yet.

Miklos is right, for the moment FUSE and v9fs have some overlap, but
they remain very different things.  FUSE is far more focused on
delivering user-space file servers, and as such has a better solution
for developing user-space file servers.  We are still focusing on
getting the core of v9fs worked out.  When we eventually have that
working smoothly, I'd like to think we'd be able to spend some time
developing a file server SDK as rich as FUSE (perhaps something
API-compatible as I mentioned before) -- but we want to focus on
getting the core protocol implementation right first - since it has
uses beyond user-space file servers.

 -eric

[-mm PATCH] v9fs: cleanup fd transport

2005-08-31 Thread Eric Van Hensbergen

[PATCH] v9fs: cleanup fd transport

Signed-off-by: Latchesar Ionkov <[EMAIL PROTECTED]>
Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>

---
commit a1949213f1723a7b8bba8edfa118985460d31604
tree 40224cafbfb68543c60a8e0f04ae669cba2cedf7
parent 3f92b2539fe581ee9011d687fbd43cebb641465e
author Eric Van Hensbergen <[EMAIL PROTECTED]> Wed, 31 Aug 2005 16:02:42 -0500
committer Eric Van Hensbergen <[EMAIL PROTECTED]> Wed, 31 Aug 2005 16:02:42 -0500

 fs/9p/trans_fd.c |   42 +++---
 fs/9p/v9fs.c |5 -
 2 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/fs/9p/trans_fd.c b/fs/9p/trans_fd.c
--- a/fs/9p/trans_fd.c
+++ b/fs/9p/trans_fd.c
@@ -56,6 +56,9 @@ static int v9fs_fd_recv(struct v9fs_tran
 {
struct v9fs_trans_fd *ts = trans ? trans->priv : NULL;
 
+   if (!trans || trans->status != Connected || !ts)
+   return -EIO;
+
return kernel_read(ts->in_file, ts->in_file->f_pos, v, len);
 }
 
@@ -73,6 +76,9 @@ static int v9fs_fd_send(struct v9fs_tran
mm_segment_t oldfs = get_fs();
int ret = 0;
 
+   if (!trans || trans->status != Connected || !ts)
+   return -EIO;
+
set_fs(get_ds());
/* The cast to a user pointer is valid due to the set_fs() */
ret = vfs_write(ts->out_file, (void __user *)v, len, &ts->out_file->f_pos);
@@ -95,6 +101,11 @@ v9fs_fd_init(struct v9fs_session_info *v
struct v9fs_trans_fd *ts = NULL;
struct v9fs_transport *trans = v9ses->transport;
 
+   if((v9ses->wfdno == ~0) || (v9ses->rfdno == ~0)) {
+   printk(KERN_ERR "v9fs: Insufficient options for proto=fd\n");
+   return -ENOPROTOOPT;
+   }
+
sema_init(&trans->writelock, 1);
sema_init(&trans->readlock, 1);
 
@@ -103,11 +114,21 @@ v9fs_fd_init(struct v9fs_session_info *v
if (!ts)
return -ENOMEM;
 
-   trans->priv = ts;
-
ts->in_file = fget( v9ses->rfdno );
ts->out_file = fget( v9ses->wfdno );
 
+   if (!ts->in_file || !ts->out_file) {
+   if (ts->in_file)
+   fput(ts->in_file);
+
+   if (ts->out_file)
+   fput(ts->out_file);
+
+   kfree(ts);
+   return -EIO;
+   }
+
+   trans->priv = ts;
trans->status = Connected;
 
return 0;
@@ -122,7 +143,22 @@ v9fs_fd_init(struct v9fs_session_info *v
 
 static void v9fs_fd_close(struct v9fs_transport *trans)
 {
-   struct v9fs_trans_fd *ts = trans ? trans->priv : NULL;
+   struct v9fs_trans_fd *ts;
+
+   if (!trans) 
+   return;
+
+   trans->status = Disconnected;
+   ts = trans->priv;
+
+   if (!ts)
+   return;
+
+   if (ts->in_file)
+   fput(ts->in_file);
+
+   if (ts->out_file)
+   fput(ts->out_file);
 
kfree(ts);
 }
diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c
--- a/fs/9p/v9fs.c
+++ b/fs/9p/v9fs.c
@@ -296,11 +296,6 @@ v9fs_session_init(struct v9fs_session_in
case PROTO_FD:
trans_proto = &v9fs_trans_fd;
*v9ses->remotename = 0;
-   if((v9ses->wfdno == ~0) || (v9ses->rfdno == ~0)) {
-   printk(KERN_ERR "v9fs: Insufficient options for proto=fd\n");
-   retval = -ENOPROTOOPT;
-   goto SessCleanUp;
-   }
break;
default:
printk(KERN_ERR "v9fs: Bad mount protocol %d\n", v9ses->proto);



[PATCH] v9fs: Support to force umount

2005-08-31 Thread Eric Van Hensbergen

[PATCH] v9fs: Support to force umount

Support for force umount
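
The core idea of the patch below (stamp every pending request with an error, then wake the waiters) can be sketched outside the kernel like this. It is a simplified illustration, not the v9fs code: toy_req and toy_cancel_all are hypothetical names, a plain singly linked list replaces the kernel list, and the locking and wake-up steps are omitted.

```c
#include <errno.h>
#include <stddef.h>

struct toy_req {
	int err;		/* 0 while pending, negative errno once cancelled */
	struct toy_req *next;
};

/*
 * Mark every request on the list as failed with 'err' and return how
 * many were touched.  In the kernel version this runs under the mux
 * lock and is followed by a wake-up of all waiting readers.
 */
static int toy_cancel_all(struct toy_req *head, int err)
{
	struct toy_req *r;
	int n = 0;

	for (r = head; r != NULL; r = r->next) {
		r->err = err;
		n++;
	}
	return n;
}
```

A waiter that wakes up and finds a non-zero err returns that error instead of blocking for a reply that will never arrive, which is what lets a forced umount proceed.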

Signed-off-by: Latchesar Ionkov <[EMAIL PROTECTED]>
Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>

---
commit 3f92b2539fe581ee9011d687fbd43cebb641465e
tree cd34696129c3b636b85578f659f260100196dee1
parent 83f1fe3d2adc3746d719e430d0a794de1f151c40
author Eric Van Hensbergen <[EMAIL PROTECTED]> Wed, 31 Aug 2005 15:53:14 -0500
committer Eric Van Hensbergen <[EMAIL PROTECTED]> Wed, 31 Aug 2005 15:53:14 -0500

 fs/9p/mux.c   |   20 
 fs/9p/mux.h   |1 +
 fs/9p/v9fs.c  |9 +
 fs/9p/v9fs.h  |4 +---
 fs/9p/vfs_super.c |9 +
 5 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/fs/9p/mux.c b/fs/9p/mux.c
--- a/fs/9p/mux.c
+++ b/fs/9p/mux.c
@@ -331,6 +331,26 @@ v9fs_mux_rpc(struct v9fs_session_info *v
 }
 
 /**
+ * v9fs_mux_cancel_requests - cancels all pending requests
+ *
+ * @v9ses: session info structure
+ * @err: error code to return to the requests
+ */
+void v9fs_mux_cancel_requests(struct v9fs_session_info *v9ses, int err)
+{
+   struct v9fs_rpcreq *rptr;
+   struct v9fs_rpcreq *rreq;
+
+   dprintk(DEBUG_MUX, " %d\n", err);
+   spin_lock(&v9ses->muxlock);
+   list_for_each_entry_safe(rreq, rptr, &v9ses->mux_fcalls, next) {
+   rreq->err = err;
+   }
+   spin_unlock(&v9ses->muxlock);
+   wake_up_all(&v9ses->read_wait);
+}
+
+/**
  * v9fs_recvproc - kproc to handle demultiplexing responses
  * @data: session info structure
  *
diff --git a/fs/9p/mux.h b/fs/9p/mux.h
--- a/fs/9p/mux.h
+++ b/fs/9p/mux.h
@@ -38,3 +38,4 @@ struct v9fs_rpcreq {
 int v9fs_mux_init(struct v9fs_session_info *v9ses, const char *dev_name);
 long v9fs_mux_rpc(struct v9fs_session_info *v9ses,
  struct v9fs_fcall *tcall, struct v9fs_fcall **rcall);
+void v9fs_mux_cancel_requests(struct v9fs_session_info *v9ses, int err);
diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c
--- a/fs/9p/v9fs.c
+++ b/fs/9p/v9fs.c
@@ -414,6 +414,15 @@ void v9fs_session_close(struct v9fs_sess
putname(v9ses->remotename);
 }
 
+/**
+ * v9fs_session_cancel - mark transport as disconnected 
+ * and cancel all pending requests.
+ */
+void v9fs_session_cancel(struct v9fs_session_info *v9ses) {
+   v9ses->transport->status = Disconnected;
+   v9fs_mux_cancel_requests(v9ses, -EIO);
+}
+
 extern int v9fs_error_init(void);
 
 /**
diff --git a/fs/9p/v9fs.h b/fs/9p/v9fs.h
--- a/fs/9p/v9fs.h
+++ b/fs/9p/v9fs.h
@@ -89,9 +89,7 @@ struct v9fs_session_info *v9fs_inode2v9s
 void v9fs_session_close(struct v9fs_session_info *v9ses);
 int v9fs_get_idpool(struct v9fs_idpool *p);
 void v9fs_put_idpool(int id, struct v9fs_idpool *p);
-int v9fs_get_option(char *opts, char *name, char *buf, int buflen);
-long long v9fs_get_int_option(char *opts, char *name, long long dflt);
-int v9fs_parse_tcp_devname(const char *devname, char **addr, char **remotename);
+void v9fs_session_cancel(struct v9fs_session_info *v9ses);
 
 #define V9FS_MAGIC 0x01021997
 
diff --git a/fs/9p/vfs_super.c b/fs/9p/vfs_super.c
--- a/fs/9p/vfs_super.c
+++ b/fs/9p/vfs_super.c
@@ -257,10 +257,19 @@ static int v9fs_show_options(struct seq_
return 0;
 }
 
+static void
+v9fs_umount_begin(struct super_block *sb)
+{
+   struct v9fs_session_info *v9ses = sb->s_fs_info;
+
+   v9fs_session_cancel(v9ses);
+}
+
 static struct super_operations v9fs_super_ops = {
.statfs = simple_statfs,
.clear_inode = v9fs_clear_inode,
.show_options = v9fs_show_options,
+   .umount_begin = v9fs_umount_begin,
 };
 
 struct file_system_type v9fs_fs_type = {



[PATCH 2.6.13-rc6-mm2] v9fs: remove sparse bitwise warnings

2005-08-28 Thread Eric Van Hensbergen

[PATCH] v9fs: remove sparse bitwise warnings

Fixed a bunch of cast conversions to remove -Wbitwise warnings from
sparse.
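
For readers unfamiliar with the annotation, the sketch below shows the general mechanism in user-space C. The toy_* names are hypothetical stand-ins for the kernel's __bitwise, __force and __le16; the attributes are only visible to sparse (which defines __CHECKER__) and compile away otherwise. memcpy is used instead of a pointer cast so the example does not depend on buffer alignment.

```c
#include <stdint.h>
#include <string.h>

#ifdef __CHECKER__
#define toy_bitwise	__attribute__((bitwise))
#define toy_force	__attribute__((force))
#else
#define toy_bitwise
#define toy_force
#endif

/* A 16-bit value known to be in little-endian byte order. */
typedef uint16_t toy_bitwise toy_le16;

static toy_le16 toy_cpu_to_le16(uint16_t v)
{
	unsigned char b[2] = { (unsigned char)(v & 0xff), (unsigned char)(v >> 8) };
	uint16_t raw;

	memcpy(&raw, b, sizeof(raw));	/* raw now has little-endian layout */
	return (toy_force toy_le16)raw;
}

static uint16_t toy_le16_to_cpu(toy_le16 v)
{
	uint16_t raw = (toy_force uint16_t)v;
	unsigned char b[2];

	memcpy(b, &raw, sizeof(raw));
	return (uint16_t)(b[0] | (b[1] << 8));
}

/* Store a 16-bit value into a byte buffer in wire (little-endian) order. */
static void toy_put_int16(unsigned char *p, uint16_t val)
{
	toy_le16 le = toy_cpu_to_le16(val);

	memcpy(p, &le, sizeof(le));
}
```

Because toy_le16 is a distinct type under sparse, assigning it to a plain integer without a conversion helper is flagged, which is the class of warning the casts in this patch address.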

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>

---
commit fec4b0831dba7e27e9531d0566eec1a5646f3e79
tree dfc14f433354a8dcdb049bc8137e7f31d7cbda3e
parent 67fefd3d8da2c41c41dfd9cd69765b74e246f31f
author Eric Van Hensbergen <[EMAIL PROTECTED]> Sun, 28 Aug 2005 17:23:47 -0500
committer Eric Van Hensbergen <[EMAIL PROTECTED]> Sun, 28 Aug 2005 17:23:47 -0500

 fs/9p/conv.c |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/9p/conv.c b/fs/9p/conv.c
--- a/fs/9p/conv.c
+++ b/fs/9p/conv.c
@@ -88,7 +88,7 @@ static inline void buf_put_int16(struct 
 {
buf_check_size(buf, 2);
 
-   *(u16 *) buf->p = cpu_to_le16(val);
+   *(__le16 *) buf->p = cpu_to_le16(val);
buf->p += 2;
 }
 
@@ -96,7 +96,7 @@ static inline void buf_put_int32(struct 
 {
buf_check_size(buf, 4);
 
-   *(u32 *)buf->p = cpu_to_le32(val);
+   *(__le32 *)buf->p = cpu_to_le32(val);
buf->p += 4;
 }
 
@@ -104,7 +104,7 @@ static inline void buf_put_int64(struct 
 {
buf_check_size(buf, 8);
 
-   *(u64 *)buf->p = cpu_to_le64(val);
+   *(__le64 *)buf->p = cpu_to_le64(val);
buf->p += 8;
 }
 
@@ -147,7 +147,7 @@ static inline u16 buf_get_int16(struct c
u16 ret = 0;
 
buf_check_size(buf, 2);
-   ret = le16_to_cpu(*(u16 *)buf->p);
+   ret = le16_to_cpu(*(__le16 *)buf->p);
 
buf->p += 2;
 
@@ -159,7 +159,7 @@ static inline u32 buf_get_int32(struct c
u32 ret = 0;
 
buf_check_size(buf, 4);
-   ret = le32_to_cpu(*(u32 *)buf->p);
+   ret = le32_to_cpu(*(__le32 *)buf->p);
 
buf->p += 4;
 
@@ -171,7 +171,7 @@ static inline u64 buf_get_int64(struct c
u64 ret = 0;
 
buf_check_size(buf, 8);
-   ret = le64_to_cpu(*(u64 *)buf->p);
+   ret = le64_to_cpu(*(__le64 *)buf->p);
 
buf->p += 8;
 



[RESEND][PATCH 2.6.13-rc6-mm2] v9fs: fix plan9port example in v9fs documentation.

2005-08-28 Thread Eric Van Hensbergen

[PATCH] v9fs: Fix Plan9port example in v9fs documentation.

Resend: to fix typo that I should have caught first time around.

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>

---
commit 678b78b5268b253e21aa818fac25ea13291eafff
tree fc3d94d10d23fedee95091e372c51e1156a0360f
parent 06e00e56fdf2c3e230ff60f6fdab6db789f16e73
author Eric Van Hensbergen <[EMAIL PROTECTED]> Sun, 28 Aug 2005 16:09:12 -0500
committer Eric Van Hensbergen <[EMAIL PROTECTED]> Sun, 28 Aug 2005 16:09:12 -0500

 Documentation/filesystems/v9fs.txt |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Documentation/filesystems/v9fs.txt b/Documentation/filesystems/v9fs.txt
--- a/Documentation/filesystems/v9fs.txt
+++ b/Documentation/filesystems/v9fs.txt
@@ -20,7 +20,7 @@ For remote file server:
 
 For Plan 9 From User Space applications (http://swtch.com/plan9)
 
-   mount -t 9P /tmp/ns.root.:0/acme/acme /mnt/9 proto=unix,name=$USER
+   mount -t 9P `namespace`/acme /mnt/9 -o proto=unix,name=$USER
 
 OPTIONS
 ===



Re: [PATCH 2.6.13-rc6-mm2] v9fs: use standard kernel byteswapping routines

2005-08-28 Thread Eric Van Hensbergen

On 8/28/05, Alexey Dobriyan <[EMAIL PROTECTED]> wrote:
> On Sun, Aug 28, 2005 at 04:05:07PM -0500, Eric Van Hensbergen wrote:
> > [PATCH] v9fs: use standard kernel byteswapping routines
> >
> > Originally suggested by hch, we have removed our byteswap code
> > and replaced it with calls to the standard kernel byteswapping code.
> 
> > - buf->p[0] = val;
> > - buf->p[1] = val >> 8;
> > + *(u16 *) buf->p = cpu_to_le16(val);
> 
> *(__le16 *)
> 
> > - ret = buf->p[0] | (buf->p[1] << 8);
> > + ret = le16_to_cpu(*(u16 *)buf->p);
> 
> *(__le16 *) etc.
> 
> Otherwise sparse will warn.
> 

It didn't give me any complaints -- I'm building my kernels with a
recent (updated today) version of sparse and built with C=1 -- am I
not invoking it correctly?

-eric

[PATCH 2.6.13-rc7-mm1] v9fs: adjust follow_link and put_link to match new VFS API

2005-08-28 Thread Eric Van Hensbergen

[PATCH] v9fs: adjust follow_link and put_link to match new VFS API

In 2.6.13-rc7 the prototypes for follow_link and put_link were changed
to include support for a cookie to help reclaim resources.  This patch
adjusts their definitions in the v9fs implementation.
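
The cookie pattern itself can be illustrated in user space as below. This is a hedged sketch with hypothetical names (toy_nameidata, toy_follow_link, toy_put_link), not the VFS interface; note that the v9fs change in this patch simply returns NULL as its cookie and keeps looking the buffer up through the nameidata.

```c
#include <stdlib.h>
#include <string.h>

struct toy_nameidata {
	char *link;	/* resolved link target, valid until put_link */
};

/*
 * Resolve a link: allocate a buffer for the target and hand the same
 * pointer back as an opaque cookie.  Returns NULL (and leaves nd->link
 * NULL) if the allocation fails.
 */
static void *toy_follow_link(struct toy_nameidata *nd, const char *target)
{
	char *buf = malloc(strlen(target) + 1);

	nd->link = buf;
	if (buf != NULL)
		strcpy(buf, target);
	return buf;
}

/* Release whatever follow_link allocated, identified by the cookie. */
static void toy_put_link(struct toy_nameidata *nd, void *cookie)
{
	free(cookie);
	nd->link = NULL;
}
```

Passing the cookie back means the release side does not have to rediscover which resource to free.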

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>

---
commit 30bdd61e96418043a07d2da71bcd757a0341113f
tree 3e268ece4b911b960b47b47182972d8f439667da
parent e189afc5ed8102a56f74cb5be91a6bf3e478a06a
author Eric Van Hensbergen <[EMAIL PROTECTED]> Sun, 28 Aug 2005 16:33:42 -0500
committer Eric Van Hensbergen <[EMAIL PROTECTED]> Sun, 28 Aug 2005 16:33:42 -0500

 fs/9p/vfs_inode.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -1089,7 +1089,7 @@ static int v9fs_vfs_readlink(struct dent
  *
  */
 
-static int v9fs_vfs_follow_link(struct dentry *dentry, struct nameidata *nd)
+static void *v9fs_vfs_follow_link(struct dentry *dentry, struct nameidata *nd)
 {
int len = 0;
char *link = __getname();
@@ -1109,7 +1109,7 @@ static int v9fs_vfs_follow_link(struct d
}
nd_set_link(nd, link);
 
-   return 0;
+   return NULL;
 }
 
 /**
@@ -1119,7 +1119,7 @@ static int v9fs_vfs_follow_link(struct d
  *
  */
 
-static void v9fs_vfs_put_link(struct dentry *dentry, struct nameidata *nd)
+static void v9fs_vfs_put_link(struct dentry *dentry, struct nameidata *nd, void *p)
 {
char *s = nd_get_link(nd);
 



[PATCH 2.6.13-rc6-mm2] v9fs: fix plan9port example in v9fs documentation.

2005-08-28 Thread Eric Van Hensbergen

[PATCH] v9fs: Fix Plan9port example in v9fs documentation.

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>

---
commit 678b78b5268b253e21aa818fac25ea13291eafff
tree fc3d94d10d23fedee95091e372c51e1156a0360f
parent 06e00e56fdf2c3e230ff60f6fdab6db789f16e73
author Eric Van Hensbergen <[EMAIL PROTECTED]> Sun, 28 Aug 2005 16:09:12 -0500
committer Eric Van Hensbergen <[EMAIL PROTECTED]> Sun, 28 Aug 2005 16:09:12 -0500

 Documentation/filesystems/v9fs.txt |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Documentation/filesystems/v9fs.txt b/Documentation/filesystems/v9fs.txt
--- a/Documentation/filesystems/v9fs.txt
+++ b/Documentation/filesystems/v9fs.txt
@@ -20,7 +20,7 @@ For remote file server:
 
 For Plan 9 From User Space applications (http://swtch.com/plan9)
 
-   mount -t 9P /tmp/ns.root.:0/acme/acme /mnt/9 proto=unix,name=$USER
+   mount -t 9P `namsepace`/acme /mnt/9 -o proto=unix,name=$USER
 
 OPTIONS
 ===



[PATCH 2.6.13-rc6-mm2] v9fs: use standard kernel byteswapping routines

2005-08-28 Thread Eric Van Hensbergen

[PATCH] v9fs: use standard kernel byteswapping routines

Originally suggested by hch, we have removed our byteswap code
and replaced it with calls to the standard kernel byteswapping code.
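
As a reference for what the removed open-coded routines compute, here is a small user-space sketch of the shift-based 32-bit encode and decode. The function names are hypothetical; the point is only that the bytes land in little-endian order regardless of host endianness, which is the same layout cpu_to_le32() and le32_to_cpu() work with.

```c
#include <stdint.h>

/* Encode 'val' into p[0..3], least significant byte first. */
static void toy_put_u32_shift(unsigned char *p, uint32_t val)
{
	p[0] = (unsigned char)(val & 0xff);
	p[1] = (unsigned char)((val >> 8) & 0xff);
	p[2] = (unsigned char)((val >> 16) & 0xff);
	p[3] = (unsigned char)((val >> 24) & 0xff);
}

/* Decode a little-endian 32-bit value from p[0..3]. */
static uint32_t toy_get_u32_shift(const unsigned char *p)
{
	/* Widen each byte before shifting to avoid signed-int overflow. */
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}
```

One difference worth keeping in mind: the byte-wise form works at any buffer offset, whereas storing through a cast pointer assumes the architecture tolerates that access at the given alignment.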

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>

---
commit 06e00e56fdf2c3e230ff60f6fdab6db789f16e73
tree 6eff647a71c056d133aa0f0a9e0a0ff95af05683
parent f32fc66e311abe9e7167991e6b2d37e7c56dcc72
author Eric Van Hensbergen <[EMAIL PROTECTED]> Sun, 28 Aug 2005 16:03:40 -0500
committer Eric Van Hensbergen <[EMAIL PROTECTED]> Sun, 28 Aug 2005 16:03:40 -0500

 fs/9p/conv.c |   28 ++--
 1 files changed, 6 insertions(+), 22 deletions(-)

diff --git a/fs/9p/conv.c b/fs/9p/conv.c
--- a/fs/9p/conv.c
+++ b/fs/9p/conv.c
@@ -88,8 +88,7 @@ static inline void buf_put_int16(struct 
 {
buf_check_size(buf, 2);
 
-   buf->p[0] = val;
-   buf->p[1] = val >> 8;
+   *(u16 *) buf->p = cpu_to_le16(val);
buf->p += 2;
 }
 
@@ -97,10 +96,7 @@ static inline void buf_put_int32(struct 
 {
buf_check_size(buf, 4);
 
-   buf->p[0] = val;
-   buf->p[1] = val >> 8;
-   buf->p[2] = val >> 16;
-   buf->p[3] = val >> 24;
+   *(u32 *)buf->p = cpu_to_le32(val);
buf->p += 4;
 }
 
@@ -108,14 +104,7 @@ static inline void buf_put_int64(struct 
 {
buf_check_size(buf, 8);
 
-   buf->p[0] = val;
-   buf->p[1] = val >> 8;
-   buf->p[2] = val >> 16;
-   buf->p[3] = val >> 24;
-   buf->p[4] = val >> 32;
-   buf->p[5] = val >> 40;
-   buf->p[6] = val >> 48;
-   buf->p[7] = val >> 56;
+   *(u64 *)buf->p = cpu_to_le64(val);
buf->p += 8;
 }
 
@@ -158,7 +147,7 @@ static inline u16 buf_get_int16(struct c
u16 ret = 0;
 
buf_check_size(buf, 2);
-   ret = buf->p[0] | (buf->p[1] << 8);
+   ret = le16_to_cpu(*(u16 *)buf->p);
 
buf->p += 2;
 
@@ -170,9 +159,7 @@ static inline u32 buf_get_int32(struct c
u32 ret = 0;
 
buf_check_size(buf, 4);
-   ret =
-   buf->p[0] | (buf->p[1] << 8) | (buf->p[2] << 16) | (buf->
-   p[3] << 24);
+   ret = le32_to_cpu(*(u32 *)buf->p);
 
buf->p += 4;
 
@@ -184,10 +171,7 @@ static inline u64 buf_get_int64(struct c
u64 ret = 0;
 
buf_check_size(buf, 8);
-   ret = (u64) buf->p[0] | ((u64) buf->p[1] << 8) |
-   ((u64) buf->p[2] << 16) | ((u64) buf->p[3] << 24) |
-   ((u64) buf->p[4] << 32) | ((u64) buf->p[5] << 40) |
-   ((u64) buf->p[6] << 48) | ((u64) buf->p[7] << 56);
+   ret = le64_to_cpu(*(u64 *)buf->p);
 
buf->p += 8;
 



[PATCH] v9fs: fix a problem with named-pipe transport

2005-08-28 Thread Eric Van Hensbergen

[PATCH] v9fs: fix a problem with named-pipe transport

Found the problem. I am not sure why, but unix_mkname in
net/unix/af_unix.c writes a zero byte outside the sockaddr_un parameter.
There is even a comment that it might seem like a bug, but it is not --
I didn't understand the explanation -- it looks like a bug to me :)

The patch that I am attaching sets the addr_len parameter of ops->connect to
sizeof(struct sockaddr_un) - 1 and thus ensures that unix_mkname won't
write outside the struct. The patch also checks that the length of the
unix socket name specified in mount doesn't exceed UNIX_PATH_MAX.
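
The length check can be sketched in userspace roughly as follows. This is only an illustration under assumptions: fill_unix_addr is a hypothetical helper, and it returns -ENAMETOOLONG where the patch below returns -ENOMEM.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Fill in a UNIX-domain address, refusing names that would not fit in
 * sun_path together with their terminating NUL byte.
 * Returns 0 on success or -ENAMETOOLONG. */
static int fill_unix_addr(struct sockaddr_un *addr, const char *path)
{
    size_t len = strlen(path);

    if (len >= sizeof(addr->sun_path))
        return -ENAMETOOLONG;

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path, path, len + 1);  /* copy includes the NUL */
    return 0;
}
```

Checking the length before copying means an oversized name is rejected before anything is written into the address structure.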

Signed-off-by: Latchesar Ionkov <[EMAIL PROTECTED]>
Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>

---
commit f32fc66e311abe9e7167991e6b2d37e7c56dcc72
tree 3b2a77e0c674e86aed92823857d33352d93938f3
parent 97bc19b509356dda0145cd19fb9768ac3c88ecda
author Eric Van Hensbergen <[EMAIL PROTECTED]> Sun, 28 Aug 2005 13:29:03 -0500
committer Eric Van Hensbergen <[EMAIL PROTECTED]> Sun, 28 Aug 2005 13:29:03 -0500

 fs/9p/trans_sock.c |   24 
 1 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/fs/9p/trans_sock.c b/fs/9p/trans_sock.c
--- a/fs/9p/trans_sock.c
+++ b/fs/9p/trans_sock.c
@@ -202,14 +202,23 @@ static int
 v9fs_unix_init(struct v9fs_session_info *v9ses, const char *dev_name,
   char *data)
 {
-   struct socket *csocket = NULL;
+   int rc;
+   struct socket *csocket;
struct sockaddr_un sun_server;
-   struct v9fs_transport *trans = v9ses->transport;
-   int rc = 0;
+   struct v9fs_transport *trans;
+   struct v9fs_trans_sock *ts;
 
-   struct v9fs_trans_sock *ts =
-   kmalloc(sizeof(struct v9fs_trans_sock), GFP_KERNEL);
+   rc = 0;
+   csocket = NULL;
+   trans = v9ses->transport;
+
+   if (strlen(dev_name) > UNIX_PATH_MAX) {
+   eprintk(KERN_ERR, "v9fs_trans_unix: address too long: %s\n",
+   dev_name);
+   return -ENOMEM;
+   }
 
+   ts = kmalloc(sizeof(struct v9fs_trans_sock), GFP_KERNEL);
if (!ts)
return -ENOMEM;
 
@@ -222,9 +231,8 @@ v9fs_unix_init(struct v9fs_session_info 
sun_server.sun_family = PF_UNIX;
strcpy(sun_server.sun_path, dev_name);
sock_create_kern(PF_UNIX, SOCK_STREAM, 0, &csocket);
-   rc = csocket->ops->connect(csocket,
-  (struct sockaddr *)&sun_server,
-  sizeof(struct sockaddr_un), 0);
+   rc = csocket->ops->connect(csocket, (struct sockaddr *)&sun_server, 
+   sizeof(struct sockaddr_un) - 1, 0); /* -1 *is* important */
if (rc < 0) {
eprintk(KERN_ERR,
"v9fs_trans_unix: problem connecting socket: %s: %d\n",



[PATCH 2.6.13-rc6-mm2] v9fs: fix handling of malformed 9P messages

2005-08-28 Thread Eric Van Hensbergen

[PATCH] v9fs: fix handling of malformed 9P messages

This patch attempts to do a better job of cleaning up after detecting
errors on the transport.  This should also improve error reporting on
broken connections to servers.
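
The idea of failing every waiter when the transport breaks can be sketched as below. This is a simplified userspace model, not the mux code: struct pending_req and dispatch are hypothetical names, and locking and wakeups are left out.

```c
#include <stddef.h>

struct pending_req {
    int tag;                    /* tag sent with the request */
    int err;                    /* negative errno once the request failed */
    const void *reply;          /* matched response, or NULL */
    struct pending_req *next;
};

/* Called by the reader after each read attempt. On a transport error every
 * pending request is marked failed; otherwise the reply is handed to the
 * request with the matching tag.
 * Returns the request that received the reply, or NULL. */
static struct pending_req *dispatch(struct pending_req *head, int read_err,
                                    int reply_tag, const void *reply)
{
    struct pending_req *r;

    if (read_err < 0) {
        for (r = head; r; r = r->next)
            r->err = read_err;
        return NULL;
    }
    for (r = head; r; r = r->next) {
        if (r->tag == reply_tag) {
            r->reply = reply;
            return r;
        }
    }
    return NULL;    /* unexpected response: nobody is waiting for this tag */
}
```

A waiter in this model would then check err before looking at reply, which mirrors the order of checks the patch adds to v9fs_recv.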

Signed-off-by: Latchesar Ionkov <[EMAIL PROTECTED]>
Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>

---
commit 97bc19b509356dda0145cd19fb9768ac3c88ecda
tree f12a9e827c949f386cca42b718bac63405e9192d
parent 2b2ebf0cea451ad876ab29159162571b5291f8b7
author Eric Van Hensbergen <[EMAIL PROTECTED]> Sun, 28 Aug 2005 13:11:33 -0500
committer Eric Van Hensbergen <[EMAIL PROTECTED]> Sun, 28 Aug 2005 13:11:33 -0500

 fs/9p/error.h  |1 +
 fs/9p/mux.c|   53 +---
 fs/9p/mux.h|1 +
 fs/9p/trans_sock.c |   12 ++--
 4 files changed, 46 insertions(+), 21 deletions(-)

diff --git a/fs/9p/error.h b/fs/9p/error.h
--- a/fs/9p/error.h
+++ b/fs/9p/error.h
@@ -47,6 +47,7 @@ static struct errormap errmap[] = {
{"Operation not permitted", EPERM},
{"wstat prohibited", EPERM},
{"No such file or directory", ENOENT},
+   {"directory entry not found", ENOENT},
{"file not found", ENOENT},
{"Interrupted system call", EINTR},
{"Input/output error", EIO},
diff --git a/fs/9p/mux.c b/fs/9p/mux.c
--- a/fs/9p/mux.c
+++ b/fs/9p/mux.c
@@ -162,18 +162,21 @@ static int v9fs_recv(struct v9fs_session
dprintk(DEBUG_MUX, "waiting for response: %d\n", req->tcall->tag);
ret = wait_event_interruptible(v9ses->read_wait,
   ((v9ses->transport->status != Connected) ||
-   (req->rcall != 0) || dprintcond(v9ses, req)));
+   (req->rcall != 0) || (req->err < 0) || 
+   dprintcond(v9ses, req)));
 
dprintk(DEBUG_MUX, "got it: rcall %p\n", req->rcall);
+
+   spin_lock(&v9ses->muxlock);
+   list_del(&req->next);
+   spin_unlock(&v9ses->muxlock);
+
+   if (req->err < 0)
+   return req->err;
+
if (v9ses->transport->status == Disconnected)
return -ECONNRESET;
 
-   if (ret == 0) {
-   spin_lock(&v9ses->muxlock);
-   list_del(&req->next);
-   spin_unlock(&v9ses->muxlock);
-   }
-
return ret;
 }
 
@@ -245,6 +248,9 @@ v9fs_mux_rpc(struct v9fs_session_info *v
if (!v9ses)
return -EINVAL;
 
+   if (!v9ses->transport || v9ses->transport->status != Connected)
+   return -EIO;
+
if (rcall)
*rcall = NULL;
 
@@ -257,6 +263,7 @@ v9fs_mux_rpc(struct v9fs_session_info *v
tcall->tag = tid;
 
req.tcall = tcall;
+   req.err = 0;
req.rcall = NULL;
 
ret = v9fs_send(v9ses, &req);
@@ -351,16 +358,21 @@ static int v9fs_recvproc(void *data)
}
 
err = read_message(v9ses, rcall, v9ses->maxdata + V9FS_IOHDRSZ);
-   if (err < 0) {
-   kfree(rcall);
-   break;
-   }
spin_lock(&v9ses->muxlock);
-   list_for_each_entry_safe(rreq, rptr, &v9ses->mux_fcalls, next) {
-   if (rreq->tcall->tag == rcall->tag) {
-   req = rreq;
-   req->rcall = rcall;
-   break;
+   if (err < 0) {
+   list_for_each_entry_safe(rreq, rptr, &v9ses->mux_fcalls, next) {
+   rreq->err = err;
+   }
+   if(err != -ERESTARTSYS)
+   eprintk(KERN_ERR, 
+   "Transport error while reading message %d\n", err);
+   } else {
+   list_for_each_entry_safe(rreq, rptr, &v9ses->mux_fcalls, next) {
+   if (rreq->tcall->tag == rcall->tag) {
+   req = rreq;
+   req->rcall = rcall;
+   break;
+   }
}
}
 
@@ -379,9 +391,10 @@ static int v9fs_recvproc(void *data)
spin_unlock(&v9ses->muxlock);
 
if (!req) {
-   dprintk(DEBUG_ERROR,
-   "unexpected response: id %d tag %d\n",
-   rcall->id, rcall->tag);
+   if (err >= 0)
+   dprintk(DEBUG_ERROR,
+   "unexpected

[PATCH 2.6.13-rc6-mm2] v9fs: readlink extended mode check

2005-08-28 Thread Eric Van Hensbergen

LANL reported some issues with random crashes during mount of
legacy protocol servers (9P2000 versus 9P2000.u) -- crash was always
happening in readlink (which should never happen in legacy mode).  Added
some sanity conditionals to the get_inode code which should prevent the
errors LANL was seeing.  Code tested benign through regression.
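
The shape of the check, reduced to a userspace sketch: check_mode and pick_dir_ops are hypothetical names, the enum stands in for the two inode_operations tables, and the exact set of modes treated as extended-only here is an assumption.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <sys/stat.h>

enum dir_ops { DIR_OPS_LEGACY, DIR_OPS_EXTENDED };

/* Decide whether an inode of the given mode may be created for a session.
 * Returns 0 if allowed, -EINVAL if the mode needs 9P2000.u or is unknown. */
static int check_mode(mode_t mode, int extended)
{
    switch (mode & S_IFMT) {
    case S_IFIFO:
    case S_IFBLK:
    case S_IFCHR:
    case S_IFSOCK:
    case S_IFLNK:
        return extended ? 0 : -EINVAL;
    case S_IFREG:
    case S_IFDIR:
        return 0;
    default:
        return -EINVAL;
    }
}

/* Legacy sessions get a directory ops table without symlink and link. */
static enum dir_ops pick_dir_ops(int extended)
{
    return extended ? DIR_OPS_EXTENDED : DIR_OPS_LEGACY;
}
```

With this arrangement a legacy session can never reach the symlink operations, because no symlink inode is created for it in the first place.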

Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>

---
commit 4bbf929d3991fde7eeb8754ae10025644637a268
tree bf671c4f29343ef86eb9c00030fa66d06915560b
parent f58a81f47f45c929ea0a1f74f9f15a27d3ad4ded
author Eric Van Hensbergen <[EMAIL PROTECTED]> Tue, 02 Aug 2005 15:40:46 -0500
committer Eric Van Hensbergen <[EMAIL PROTECTED]> Tue, 02 Aug 2005 15:40:46 -0500

 fs/9p/vfs_inode.c |   35 ++-
 1 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -44,6 +44,7 @@
 #include "fid.h"
 
 static struct inode_operations v9fs_dir_inode_operations;
+static struct inode_operations v9fs_dir_inode_operations_ext;
 static struct inode_operations v9fs_file_inode_operations;
 static struct inode_operations v9fs_symlink_inode_operations;
 
@@ -232,6 +233,7 @@ v9fs_mistat2unix(struct v9fs_stat *mista
 struct inode *v9fs_get_inode(struct super_block *sb, int mode)
 {
struct inode *inode = NULL;
+   struct v9fs_session_info *v9ses = sb->s_fs_info;
 
dprintk(DEBUG_VFS, "super block: %p mode: %o\n", sb, mode);
 
@@ -250,6 +252,10 @@ struct inode *v9fs_get_inode(struct supe
case S_IFBLK:
case S_IFCHR:
case S_IFSOCK:
+   if(!v9ses->extended) {
+   dprintk(DEBUG_ERROR, "special files without extended mode\n");
+   return ERR_PTR(-EINVAL);
+   }
init_special_inode(inode, inode->i_mode,
   inode->i_rdev);
break;
@@ -257,14 +263,21 @@ struct inode *v9fs_get_inode(struct supe
inode->i_op = &v9fs_file_inode_operations;
inode->i_fop = &v9fs_file_operations;
break;
+   case S_IFLNK:
+   if(!v9ses->extended) {
+   dprintk(DEBUG_ERROR, "extended modes used w/o 9P2000.u\n");
+   return ERR_PTR(-EINVAL);
+   }
+   inode->i_op = &v9fs_symlink_inode_operations;
+   break;
case S_IFDIR:
inode->i_nlink++;
-   inode->i_op = &v9fs_dir_inode_operations;
+   if(v9ses->extended)
+   inode->i_op = &v9fs_dir_inode_operations_ext;
+   else
+   inode->i_op = &v9fs_dir_inode_operations;
inode->i_fop = &v9fs_dir_operations;
break;
-   case S_IFLNK:
-   inode->i_op = &v9fs_symlink_inode_operations;
-   break;
default:
dprintk(DEBUG_ERROR, "BAD mode 0x%x S_IFMT 0x%x\n",
mode, mode & S_IFMT);
@@ -1284,7 +1297,7 @@ v9fs_vfs_mknod(struct inode *dir, struct
return retval;
 }
 
-static struct inode_operations v9fs_dir_inode_operations = {
+static struct inode_operations v9fs_dir_inode_operations_ext = {
.create = v9fs_vfs_create,
.lookup = v9fs_vfs_lookup,
.symlink = v9fs_vfs_symlink,
@@ -1299,6 +1312,18 @@ static struct inode_operations v9fs_dir_
.setattr = v9fs_vfs_setattr,
 };
 
+static struct inode_operations v9fs_dir_inode_operations = {
+   .create = v9fs_vfs_create,
+   .lookup = v9fs_vfs_lookup,
+   .unlink = v9fs_vfs_unlink,
+   .mkdir = v9fs_vfs_mkdir,
+   .rmdir = v9fs_vfs_rmdir,
+   .mknod = v9fs_vfs_mknod,
+   .rename = v9fs_vfs_rename,
+   .getattr = v9fs_vfs_getattr,
+   .setattr = v9fs_vfs_setattr,
+};
+
 static struct inode_operations v9fs_file_inode_operations = {
.getattr = v9fs_vfs_getattr,
.setattr = v9fs_vfs_setattr,



Re: [V9fs-developer] Re: [PATCH 2.6.13-rc3-mm2] v9fs: add fd based transport

2005-07-28 Thread Eric Van Hensbergen

On 7/28/05, Ronald G. Minnich  wrote:
> 
> 
> On Thu, 28 Jul 2005, Christoph Hellwig wrote:
> 
> > Couldn't the two other transports be implemented ontop of this one using
> > a mount helper doing the pipe or tcp setup?
> 
> that's how we did it in the version we did for 2.4. I don't see why not.
> 

I strayed away from doing it this way originally for two reasons -
perhaps both are not really valid:

a) I really disliked requiring a helper application to mount a file
system.  I really wanted to be able to boot a diskless system with no
initrd and have just 9P serving root.  I figured if I could enable
people to use 9P without having a helper app, it would be used by more
folks -- of course the need for things like DNS resolution, etc. that
helper apps tend to provide sort of invalidates this piece of things.

b) I was concerned with additional copy overhead - one of the other
transports which isn't published yet uses shared memory (to
virtualized partitions) and it just seemed easier to deal with that in
the kernel rather than punting to a user-level application -- so in
short, I figured keeping the transport modules in the kernel made
sense.  Of course, that doesn't have anything to do with the socket
interfaces being in the kernel -- I don't think there is any
additional copy overhead when using an fd versus a sock.

That being said, many things may be much easier with a user-level
helper - have user level security modules for instance.

I guess I'm not opposed to removing the TCP and named-pipe transports
if folks think that's a reasonable thing to do -- but I'd like to keep
the modular transport infrastructure to support things like the shared
memory transport.  Of course we also need to get our act in gear and
make a reasonable mount-helper application available -- we've got
three versions right now and two of them rely on the Plan 9 from User
Space packages.

Anybody against taking this path?

 -eric

Re: (v9fs) -mm -> 2.6.13 merge status

2005-07-14 Thread Eric Van Hensbergen

On 7/14/05, Alexey Dobriyan <[EMAIL PROTECTED]> wrote:
> On Friday 15 July 2005 00:04, Christoph Hellwig wrote:
> > normally we prefer a patch per actual change, not per file so the
> > description fits.  Given that all these are pretty trivial fixes one
> > patch would have done it aswell, though.
> >
> > With these changes the code is fine for mainline in my opinion.
> 
> Can I make one more nitpicking comment?
> 
> All these functions can use cpu_to_le*() and le*_to_cpu().
> 

I need to rethink some parts of conv.c, I'll incorporate your
suggestion during the rework.  Thanks Alexey.

 -eric

Re: [PATCH 2.6.13-rc2-mm2 2/7] v9fs: VFS file, dentry, and directory operations (2.0.2)

2005-07-14 Thread Eric Van Hensbergen

Doh!  Good catch, I'll fix and resubmit - same goes for the formatting issues.

On 7/14/05, Christoph Hellwig <[EMAIL PROTECTED]> wrote:
> > @@ -383,9 +379,10 @@ v9fs_file_write(struct file *filp, const
> >   return -ENOMEM;
> >
> >   ret = copy_from_user(buffer, data, count);
> > - if (ret)
> > + if (ret) {
> >   dprintk(DEBUG_ERROR, "Problem copying from user\n");
> > - else
> > + return -EFAULT;
> > + } else
> >   ret = v9fs_write(filp, buffer, count, offset);
> >
> >   kfree(buffer);
> 
> Aren't you leaking buffer in the error case?  Also we Linux people really
> hate an else clause when the if block contains a return statement ;-)
> 
>
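
One common way to address both review points is a single exit path that frees the buffer. The sketch below is a userspace model only: fake_copy is a hypothetical stand-in for copy_from_user(), and write_sketch and live_buffers are invented for illustration.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for copy_from_user(): returns the number of bytes
 * that could NOT be copied, so 0 means success. */
static size_t fake_copy(void *dst, const void *src, size_t n, int fail)
{
    if (fail)
        return n;
    memcpy(dst, src, n);
    return 0;
}

static int live_buffers;    /* counts allocations not yet freed */

/* Single exit path: the buffer is released on success and on failure,
 * and there is no else branch after the early exit. */
static long write_sketch(const void *data, size_t count, int fail)
{
    long ret;
    char *buffer = malloc(count);

    if (!buffer)
        return -ENOMEM;
    live_buffers++;

    if (fake_copy(buffer, data, count, fail)) {
        ret = -EFAULT;
        goto out;
    }
    ret = (long)count;      /* pretend the write consumed everything */
out:
    free(buffer);
    live_buffers--;
    return ret;
}
```

The live_buffers counter exists only so the sketch can show that the error path no longer leaks.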

Re: [PATCH 2.6.13-rc2-mm2 5/7] v9fs: 9P protocol implementation (2.0.2)

2005-07-14 Thread Eric Van Hensbergen

On 7/14/05, Christoph Hellwig <[EMAIL PROTECTED]> wrote:
> > +static inline void buf_check_size(struct cbuf *buf, int len)
> > +{
> > + if (buf->p+len > buf->ep) {
> > + if (buf->p < buf->ep) {
> > + eprintk(KERN_ERR, "buffer overflow\n");
> > + buf->p = buf->ep + 1;
> > + }
> > + }
> > +}
> 
> "handling" a buffer overflow with a printk doesn't seem appopinquate.
> In what cases can this happen and what problems may it cause?
> 

I believe all of these cases represent what we would consider to be
protocol errors.  I suppose it is possible that our truncation
approach could be used as an exploit in some weird case -- I'll take a
look at fixing things so that any such overflow case is treated as a
fatal protocol error and reported as such (via the protocol as
appropriate).
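
One possible shape for that, sketched in userspace under assumptions: struct cbuf_sketch, buf_has and the getters are hypothetical, and a real decoder would turn the sticky flag into a protocol error once the message has been parsed.

```c
#include <stddef.h>
#include <stdint.h>

struct cbuf_sketch {
    const uint8_t *p;       /* current read position */
    const uint8_t *ep;      /* one past the last valid byte */
    int overflowed;         /* sticky: set once, never cleared */
};

/* Returns 1 if len bytes are available, otherwise marks the buffer bad. */
static int buf_has(struct cbuf_sketch *buf, size_t len)
{
    if (buf->overflowed || (size_t)(buf->ep - buf->p) < len) {
        buf->overflowed = 1;
        return 0;
    }
    return 1;
}

/* Read one byte, or return 0 without advancing if the buffer is bad. */
static uint8_t buf_get_u8(struct cbuf_sketch *buf)
{
    if (!buf_has(buf, 1))
        return 0;
    return *buf->p++;
}

/* Read a little-endian 16-bit value, or return 0 if the buffer is bad. */
static uint16_t buf_get_le16(struct cbuf_sketch *buf)
{
    uint16_t v;

    if (!buf_has(buf, 2))
        return 0;
    v = (uint16_t)(buf->p[0] | (buf->p[1] << 8));
    buf->p += 2;
    return v;
}
```

Because the flag is sticky, a caller can decode a whole message and check overflowed once at the end instead of after every field.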

  -eric

Re: (v9fs) -mm -> 2.6.13 merge status

2005-07-13 Thread Eric Van Hensbergen

On 6/27/05, Christoph Hellwig <[EMAIL PROTECTED]> wrote:
> 
> That beeing said there's a few issues with the code still I'd like to
> see fixed:
>

Sorry I didn't get to these quicker - was on vacation and basically
off-line for the past week and a half.  I've made 90% of the changes
suggested and committed them to my git tree, I'll combine the changes
into a single patch and then split them by file-group before sending
them to akpm to more closely match the existing patches.  The 10% I
didn't address I'll comment on below, most of them represent harder
problems that I'd like to think about a bit more.

>
>   - there's three sparse warnings still.  Two of them are easily fixed
> by moving externs to headers, one doesn't look fixable until we get
> a sane in-kernel api for socket operations

done

>   - some dentry handling looks rather odd.  Why are you for example
> calling d_drop in v9fs_vfs_symlink, v9fs_vfs_mknod and v9fs_vfs_link?
> Shouldn't all these call d_instantatiate to actually reuse the
> dentry as in v9fs_vfs_create?  Also what's the issue with
> v9fs_fid_insert?  It would seem better and more logical to me to
> always set d_fsdata in create/mknod/symlink/open before hashing it
> and then beeing able to rely on it beeing non-NULL.

All of this is kind of tricky due to the association of fids with
dentry elements and the special way we handle certain features (such
as special files and symlinks).  The current code aggressively
invalidates fids to prevent the dcache from masking operations that
may be semantically important to synthetic file systems.  If you look
in v9fs_create we actually d_drop the dentry for created directories
as well.  The only reason we don't d_drop normal files is because we
are trying to preserve the atomic create/open semantics.

I'm not 100% confident this is the right solution, but it's the closest
I've been able to come so far -- there's actually been a fair amount
of discussion on this in the v9fs-developer's list.  If you want more
details, it's probably worth a separate thread to discuss the reasons
behind why we want to aggressively invalidate the dcache and how we
have tried to accomplish this -- or we could just catch up at OLS.

>   - buf_check_sizep, buf_check_size and buf_check_sizev should be made
> inlines, and lose the implict return.  Please don't hide such
> things in macros

done

>   - please avoid using hlist_for_each, usually hlist_for_each_entry is
> a much better choice
>   - dito for list_for_each_safe vs list_for_each_entry_safe

done

>   - can you please check whether lib/idr.c fullfills your needs so we
> can get rid of idpool.c?

Last time I looked idr didn't do exactly what I wanted, but looking
over it again I realize it's just doing more than I want -- so I've
eliminated idpool.*, but still have wrapper functions to encapsulate
locking and retry -- it strikes me that there may be a case for
generalizing these wrapper functions and putting them in lib/idr.c,
but figured that could wait.

>   - v9fs_inode2v9ses has lots of useless checks, inode->i_sb can never
> be NULL, and inode->i_sb->s_fs_info can't be either once set in
> fill_inode, which is before the first inode on the filesystem is
> created.  Also the argument is never NULL.  Because of that you
> can also kill all the return value checks in the callers.
>   - do you really need to keep v9fs_dentry_delete just for the dprintk?
>   - no need to check for a NULL file in v9fs_dir_readdir, the VFS gurantees
> it's not.  And if it was you'd better be off panic because something
> is enormously fscked.
>   - Dito for v9fs_file_open
>   - And the inode in v9fs_file_lock
>   - And dir, file, file->d_inode, sb, v9ses in v9fs_remove.
>   - And dir, sb and v9ses in v9fs_vfs_lookup
>   - And dir, sb and v9ses in v9fs_vfs_symlink
>   - And dir, sb and v9ses in v9fs_vfs_link
>   - And dir, sb and v9ses in v9fs_vfs_mknod

Yeah, all of these were sanity checks during initial development while
I was still understanding the VFS API.  I think I got most of them
this time.

>   - copy_from_user returns the bytes actually copied in the failure case,
> but you should return -EFAULT instead of that number in v9fs_file_write

fixed

>   - No need to implement v9fs_file_mmap, do_mmap_pgoff makes sure to error
> out if it's not present (and actually returns the correct errno)
>   - I think it's pretty similar for all these checks for fid (=private_data)
> checks.  You always set them in open, so they can't be NULL
>   - kfree can be called with a NULL argument just fine, you can remove
> lots of ifs for that. You also often set pointers to NULL just before
> freeing a structure - that's pretty useless as slab debugging will
> catch bugs with stary references very well, and overwrites these NULLs
> ASAP.
>   - The call to ->put_inode in the error case of v9fs_get_inode is very
> wrong.  You'd actually pani

Re: [RFC] FUSE permission modell (Was: fuse review bits)

2005-04-19 Thread Eric Van Hensbergen

Somewhat related question for Viro/the group:

Why is CLONE_NEWNS considered a privileged operation?  Would placing
limits on the number of private namespaces a user can own solve any
resource concerns or is there something more nefarious I'm missing?

Re: [RFC] FUSE permission modell (Was: fuse review bits)

2005-04-19 Thread Eric Van Hensbergen

On 4/19/05, Bodo Eggert <[EMAIL PROTECTED]> wrote:
> >
> > Well, that would kinda be the intent behind the permissions file  --
> > it can specify what restricted set of images/devices/whatever the user
> > can mount, I suppose the sensible thing would be to always enforce
> > nosuid and nsgid, but I'd rather keep these as the default version of
> > options (allowing admins to shoot themselves in the foot perhaps, but
> > in the single-user workstation case, is seems like there's less reason
> > to be so paranoid).
> 
> I think you shouldn't help the admins by creating shoes with target marks.
>

Fair enough.  Since I don't really have any cases I can think of that
require this sort of behavior, I'll back off on allowing user mounts
with suid or sgid enabled.
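
In terms of the mount(2) flag word, forcing these options amounts to OR-ing them into whatever the user asked for. A minimal sketch, assuming Linux and glibc's <sys/mount.h>; USER_MOUNT_FORCED and user_mount_flags are hypothetical names. Note that Linux has no separate nosgid flag: MS_NOSUID covers both the set-user-ID and set-group-ID bits.

```c
#include <sys/mount.h>

/* Flags a hypothetical user-mount policy would always force on. */
#define USER_MOUNT_FORCED (MS_NOSUID | MS_NODEV)

/* Combine the flags a user requested with the mandatory ones. */
static unsigned long user_mount_flags(unsigned long requested)
{
    return requested | USER_MOUNT_FORCED;
}
```

Whatever the caller passes in, the forced bits end up set, so there is no configuration in which a user mount honors suid binaries or device nodes.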

> 
> Allowing user mounts with no* should be allways ok (no config needed
> besides the ulimit), and mounting specified files to defined locations
> is allready supported by fstab.
>

Do folks think that the limits should be per-user or per-process for
user-mounts?  And what about separate limits for # of private namespaces
and # of mounts?

The fstab support doesn't seem to provide enough flexibility for
certain situations.  Say I want to support mounting any remote file
system, as long as it's in the user's private hierarchy?  What if I
want users to be able to mount FUSE, v9fs, etc. user-space file
systems, but only in a private namespace and only in their private
hierarchy?  Or are these situations which you think should "always be
okay" as long as nosuid and nosgid (and newns?) are implicit?

 -eric

Re: [RFC] FUSE permission modell (Was: fuse review bits)

2005-04-19 Thread Eric Van Hensbergen

On 4/17/05, Bodo Eggert <[EMAIL PROTECTED]> wrote:
> 
> > I was thinking about this a while back and thought having a user-mount
> > permissions file might be the right way to address lots of these
> > issues.  Essentially it would contain information about what
> > users/groups were allowed to mount what sources to what destinations
> > and with what mandatory options.
> 
> Users being able to mount random fs containing suid or device nodes
> are root whenever they want to. If you want to mount with dev or suid,
> use sudo and restrict the mount to a limited set of images/devices/whatever.
>

Well, that would kinda be the intent behind the permissions file --
it can specify what restricted set of images/devices/whatever the user
can mount.  I suppose the sensible thing would be to always enforce
nosuid and nosgid, but I'd rather keep these as the default version of
options (allowing admins to shoot themselves in the foot perhaps, but
in the single-user workstation case, it seems like there's less reason
to be so paranoid).

   -eric

Re: [RFC] FUSE permission modell (Was: fuse review bits)

2005-04-17 Thread Eric Van Hensbergen

On 4/17/05, Miklos Szeredi <[EMAIL PROTECTED]> wrote:
> > >
> > >   1) Only allow mount over a directory for which the user has write
> > >  access (and is not sticky)
> > >
> > >   2) Use nosuid,nodev mount options
> > >
> > > [ parts deleted ]
> >
> > Do these solve all the security concerns with unprivileged mounts, or
> > are there other barriers/concerns?  Should there be ulimit (or rlimit)
> > style restrictions on how many mounts/binds a user is allowed to have
> > to prevent users from abusing mount privs?
> 
> Currently there is a (configurable) global limit for all non-root FUSE
> mounts.  An additional per-user limit would be nice, but from the
> security standpoint it doesn't matter.
> 
> > I was thinking about this a while back and thought having a user-mount
> > permissions file might be the right way to address lots of these
> > issues.  Essentially it would contain information about what
> > users/groups were allowed to mount what sources to what destinations
> > and with what mandatory options.
> 
> I haven't yet seen the need for such a great flexibility.  Debian
> installs fusermount (the FUSE mount utility) "-rwsr-x--- root fuse",

These are both well and good, but I was looking for a more global
system (for things other than FUSE).

> 
> > Is this unnecessary?  Is this not enough?
> 
> Maybe it is necessary, but why bother until somebody actually wants
> it?  I'm a great believer of the "lazy" development philosophy ;)
> 

Yeah, I guess I'm motivated in that I want to use normal mount to
handle v9fs user file systems, local private mounts, and local private
resource shares.  I'd also like normal users to be able to take better
advantage of -o bind.  I think it's kinda silly that we have special
purpose mounts for cifs, samba, fuse, v9fs, etc -- but I suppose
that's more of a user-space util-linux dilemma than a kernel dilemma.

 -eric

Re: [RFC] FUSE permission modell (Was: fuse review bits)

2005-04-17 Thread Eric Van Hensbergen

On 4/11/05, Miklos Szeredi <[EMAIL PROTECTED]> wrote:
> 
>   1) Only allow mount over a directory for which the user has write
>  access (and is not sticky)
> 
>   2) Use nosuid,nodev mount options
> 
> [ parts deleted ]

Do these solve all the security concerns with unprivileged mounts, or
are there other barriers/concerns?  Should there be ulimit (or rlimit)
style restrictions on how many mounts/binds a user is allowed to have
to prevent users from abusing mount privs?

I was thinking about this a while back and thought having a user-mount
permissions file might be the right way to address lots of these
issues.  Essentially it would contain information about what
users/groups were allowed to mount what sources to what destinations
and with what mandatory options.

You can get the start of this with the user/users/etc. stuff in
/etc/fstab, but I was envisioning something a bit more dynamic with
regular expression based rules for sources and destinations.   So,
something like:

# /etc/usermounts: user mount permissions

#  

# allow users to mount any file system under their home directory
*            $HOME          *       nosuid, nosgid
# allow users to bind over /usr/bin as long as it's only in their private namespace
*            /usr/bin       bind    newns
# allow users to loopback mount distributed file systems to /mnt
127.0.0.1    /mnt           *       nosuid, nosgid
# allow users to mount files over any directory they have write access to
*            (perm=0222)    *       nosuid, nosgid

Is this unnecessary?  Is this not enough?

   -eric

Re: [RFC] FUSE permission modell (Was: fuse review bits)

2005-04-17 Thread Eric Van Hensbergen

On 4/12/05, Miklos Szeredi <[EMAIL PROTECTED]> wrote:
> > I think that would be _much_ nicer implemented as a mount which is
> > invisible to other users, rather than one which causes the admin's
> > scripts to spew error messages.
> >
> > Is the namespace mechanism at all suitable for that?
> 
> It is certainly the right tool for this.  However currently private
> namespaces are quite limited.  The only sane usage I can think of is
> that before mounting the user starts a shell with CLONE_NS, and does
> the mount in this.  However all the other programs he already has
> running (editor, browser, desktop environment) won't be able to access
> the mount.
> 

I'd like to second that I think private-namespaces are the right way
to solve this sort of problem.  It also helps not cluttering the
global namespace with user-local mounts.

>
> Shared subtrees and more support in userspace tools is needed before
> private namespaces can become really useful.
> 

I'd like to talk about this a bit more and start driving to a solution
here.  I've been looking at the namespace code quite a bit and was
just about to dive in and start checking into adding/fixing certain
aspects such as stackable namespaces, optional inheritance (changes in
a parent namespace are reflected in the child but not vice-versa),
etc.

One aspect I was thinking about here was a mount flag that would give
you a new private namespace (if you didn't already have one) for the
mount (and I guess that would impact any subsequent mounts from the
user in that shell).  Another option would be a 'newns' style
system-call, but I'm generally against adding new system calls.

Shared subtrees are a tricky one.  I know how we would handle it in
V9FS, but not sure how well that would translate to others
(essentially we'd re-export the subtree so other users could mount it
individually -- but that's a very Plan 9 solution and may not be what
more UNIX-minded folks would want -- we also need to improve our own
server infrastructure to more efficiently support such a re-export).

So, to sum up, I think private namespaces are the right solution, and
I'd rather put effort into making them more useful than work around the
fact that they're not practical right now.

   -eric
