Re: [PATCH 1/2] AFS: Fix interminable loop in afs_write_back_from_locked_page()

2007-05-11 Thread David Howells
Andrew Morton [EMAIL PROTECTED] wrote:

> Yes, it's a shame that there doesn't seem to be a fine-grained way of
> turning on -W's useful bits.

You can turn off -W's undesirable bits.  For net/rxrpc/ and fs/afs/ at least,
adding:

CFLAGS += -W -Wno-unused-parameter

to the Makefile generates no warnings.  Perhaps this should be added to the
master Makefile.

Adding -Wsign-compare finds some stuff that I will fix.  It also finds some
stuff in the main and the networking headers.  This is a really useful option
and found some tricky bugs in CacheFiles.  I would endorse adding this
generally too.
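
As a minimal sketch of the kind of bug -Wsign-compare flags (the function
names are made up for illustration and are not taken from CacheFiles or AFS):
when an int that may hold a negative error code is compared against a size_t,
the int is converted to unsigned first, so -1 becomes a huge value and the
short-read check silently passes.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if a read result is too short for a header.
 * Buggy: nread is converted to size_t, so a negative error code compares
 * as a very large length and the check reports "long enough". */
int too_short_buggy(int nread, size_t hdr_len)
{
	return nread < hdr_len;
}

/* Fixed: handle the negative case before comparing in the unsigned type. */
int too_short_fixed(int nread, size_t hdr_len)
{
	return nread < 0 || (size_t)nread < hdr_len;
}
```

Building the first function with -Wsign-compare should produce the
"comparison between signed and unsigned" warning; the second is clean.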

David
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] AFS: Fix interminable loop in afs_write_back_from_locked_page()

2007-05-11 Thread David Howells
Andrew Morton [EMAIL PROTECTED] wrote:

> > Following bug was uncovered by compiling with '-W' flag:
>
> gcc -W finds a number of fairly scary bugs.

Do you mean in my code specifically?  Or in the kernel in general?  As far as
I can tell -W only finds an eye-glazingly large quantity of 'unused parameter'
warnings in AFS and AF_RXRPC.

David


Re: [PATCH 1/2] AFS: Fix interminable loop in afs_write_back_from_locked_page()

2007-05-11 Thread David Howells
Andrew Morton [EMAIL PROTECTED] wrote:

> More than one would expect, given that it is recommended in
> Documentation/SubmitChecklist, which everyone reads ;)

Which states incorrectly:

| 22: Newly-added code has been compiled with `gcc -W'.  This will generate
| lots of noise, but is good for finding bugs like "warning: comparison
| between signed and unsigned".

-W does not imply -Wsign-compare, at least not on my gcc.

David


[PATCH] AF_RXRPC: AF_RXRPC depends on IPv4

2007-05-14 Thread David Howells
Add a dependency for CONFIG_AF_RXRPC on CONFIG_INET.  This fixes this error:

net/built-in.o: In function `rxrpc_get_peer':
(.text+0x42824): undefined reference to `ip_route_output_key'

Signed-off-by: David Howells [EMAIL PROTECTED]
---

 net/rxrpc/Kconfig |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/rxrpc/Kconfig b/net/rxrpc/Kconfig
index 91b3d52..e662f1d 100644
--- a/net/rxrpc/Kconfig
+++ b/net/rxrpc/Kconfig
@@ -4,7 +4,7 @@
 
 config AF_RXRPC
    tristate "RxRPC session sockets"
-   depends on EXPERIMENTAL
+   depends on INET && EXPERIMENTAL
    select KEYS
    help
      Say Y or M here to include support for RxRPC session sockets (just



[PATCH] AF_RXRPC: Make call state names available if CONFIG_PROC_FS=n

2007-05-14 Thread David Howells
Make the call state names array available even if CONFIG_PROC_FS is disabled
as it's used in other places (such as debugging statements) too.

Signed-off-by: David Howells [EMAIL PROTECTED]
---

 net/rxrpc/ar-call.c |   19 +++++++++++++++++++
 net/rxrpc/ar-proc.c |   19 -------------------
 2 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/net/rxrpc/ar-call.c b/net/rxrpc/ar-call.c
index 4d92d88..3c04b00 100644
--- a/net/rxrpc/ar-call.c
+++ b/net/rxrpc/ar-call.c
@@ -15,6 +15,25 @@
 #include <net/af_rxrpc.h>
 #include "ar-internal.h"
 
+const char *rxrpc_call_states[] = {
+   [RXRPC_CALL_CLIENT_SEND_REQUEST]    = "ClSndReq",
+   [RXRPC_CALL_CLIENT_AWAIT_REPLY]     = "ClAwtRpl",
+   [RXRPC_CALL_CLIENT_RECV_REPLY]      = "ClRcvRpl",
+   [RXRPC_CALL_CLIENT_FINAL_ACK]       = "ClFnlACK",
+   [RXRPC_CALL_SERVER_SECURING]        = "SvSecure",
+   [RXRPC_CALL_SERVER_ACCEPTING]       = "SvAccept",
+   [RXRPC_CALL_SERVER_RECV_REQUEST]    = "SvRcvReq",
+   [RXRPC_CALL_SERVER_ACK_REQUEST]     = "SvAckReq",
+   [RXRPC_CALL_SERVER_SEND_REPLY]      = "SvSndRpl",
+   [RXRPC_CALL_SERVER_AWAIT_ACK]       = "SvAwtACK",
+   [RXRPC_CALL_COMPLETE]               = "Complete",
+   [RXRPC_CALL_SERVER_BUSY]            = "SvBusy  ",
+   [RXRPC_CALL_REMOTELY_ABORTED]       = "RmtAbort",
+   [RXRPC_CALL_LOCALLY_ABORTED]        = "LocAbort",
+   [RXRPC_CALL_NETWORK_ERROR]          = "NetError",
+   [RXRPC_CALL_DEAD]                   = "Dead",
+};
+
 struct kmem_cache *rxrpc_call_jar;
 LIST_HEAD(rxrpc_calls);
 DEFINE_RWLOCK(rxrpc_call_lock);
diff --git a/net/rxrpc/ar-proc.c b/net/rxrpc/ar-proc.c
index 58f4b4e..1c0be0e 100644
--- a/net/rxrpc/ar-proc.c
+++ b/net/rxrpc/ar-proc.c
@@ -25,25 +25,6 @@ static const char *rxrpc_conn_states[] = {
    [RXRPC_CONN_NETWORK_ERROR]  = "NetError",
 };
 
-const char *rxrpc_call_states[] = {
-   [RXRPC_CALL_CLIENT_SEND_REQUEST]    = "ClSndReq",
-   [RXRPC_CALL_CLIENT_AWAIT_REPLY]     = "ClAwtRpl",
-   [RXRPC_CALL_CLIENT_RECV_REPLY]      = "ClRcvRpl",
-   [RXRPC_CALL_CLIENT_FINAL_ACK]       = "ClFnlACK",
-   [RXRPC_CALL_SERVER_SECURING]        = "SvSecure",
-   [RXRPC_CALL_SERVER_ACCEPTING]       = "SvAccept",
-   [RXRPC_CALL_SERVER_RECV_REQUEST]    = "SvRcvReq",
-   [RXRPC_CALL_SERVER_ACK_REQUEST]     = "SvAckReq",
-   [RXRPC_CALL_SERVER_SEND_REPLY]      = "SvSndRpl",
-   [RXRPC_CALL_SERVER_AWAIT_ACK]       = "SvAwtACK",
-   [RXRPC_CALL_COMPLETE]               = "Complete",
-   [RXRPC_CALL_SERVER_BUSY]            = "SvBusy  ",
-   [RXRPC_CALL_REMOTELY_ABORTED]       = "RmtAbort",
-   [RXRPC_CALL_LOCALLY_ABORTED]        = "LocAbort",
-   [RXRPC_CALL_NETWORK_ERROR]          = "NetError",
-   [RXRPC_CALL_DEAD]                   = "Dead",
-};
-
 /*
  * generate a list of extant and dead calls in /proc/net/rxrpc_calls
  */
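
For illustration, the pattern being moved is a name table indexed by an enum
through designated initialisers, which any file can consult once it no longer
lives in procfs-only code.  A minimal sketch follows; the enum, table and
helper names are hypothetical stand-ins, not the rxrpc definitions.

```c
#include <stddef.h>

/* Hypothetical call states; the real rxrpc enum has more entries. */
enum toy_call_state {
	TOY_CALL_CLIENT_SEND_REQUEST,
	TOY_CALL_CLIENT_AWAIT_REPLY,
	TOY_CALL_COMPLETE,
	TOY_CALL_DEAD,
	TOY_NR_CALL_STATES
};

/* Designated initialisers keep each name next to its enum value, so the
 * table stays correct even if the enum is reordered. */
const char *const toy_call_state_names[TOY_NR_CALL_STATES] = {
	[TOY_CALL_CLIENT_SEND_REQUEST]	= "ClSndReq",
	[TOY_CALL_CLIENT_AWAIT_REPLY]	= "ClAwtRpl",
	[TOY_CALL_COMPLETE]		= "Complete",
	[TOY_CALL_DEAD]			= "Dead",
};

/* Bounds-checked lookup suitable for debugging output. */
const char *toy_call_state_name(unsigned int state)
{
	if (state >= TOY_NR_CALL_STATES || !toy_call_state_names[state])
		return "?";
	return toy_call_state_names[state];
}
```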



Re: [2.6 patch] net/rxrpc/ar-connection.c: fix NULL dereference

2007-06-18 Thread David Howells
Adrian Bunk [EMAIL PROTECTED] wrote:

> This patch fixes a NULL dereference spotted by the Coverity checker.
>
> Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

Acked-by: David Howells [EMAIL PROTECTED]


Re: [RFC: 2.6 patch] net/rxrpc/ar-output.c: remove dead code

2007-06-18 Thread David Howells
Adrian Bunk [EMAIL PROTECTED] wrote:

> This patch removes dead code spotted by the Coverity checker.

This is the wrong solution.  'copied' should be updated.

David


[PATCH] AF_RXRPC: Return the number of bytes buffered in rxrpc_send_data()

2007-06-18 Thread David Howells
Return the number of bytes buffered in rxrpc_send_data().

Signed-off-by: David Howells [EMAIL PROTECTED]
---

 net/rxrpc/ar-output.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/net/rxrpc/ar-output.c b/net/rxrpc/ar-output.c
index 591c442..cc9102c 100644
--- a/net/rxrpc/ar-output.c
+++ b/net/rxrpc/ar-output.c
@@ -640,6 +640,7 @@ static int rxrpc_send_data(struct kiocb *iocb,
    goto efault;
    sp->remain -= copy;
    skb->mark += copy;
+   copied += copy;
 
    len -= copy;
    segment -= copy;
@@ -709,6 +710,8 @@ static int rxrpc_send_data(struct kiocb *iocb,
 
    } while (segment > 0);
 
+success:
+   ret = copied;
 out:
    call->tx_pending = skb;
    _leave(" = %d", ret);
@@ -725,7 +728,7 @@ call_aborted:
 
 maybe_error:
    if (copied)
-       ret = copied;
+       goto success;
    goto out;
 
 efault:



[PATCH] FRV: Fix {dis,en}able_irq_lockdep_irqrestore compile error

2006-09-05 Thread David Howells

Fix the lack of certain non-LOCKDEP stub functions in linux/interrupt.h and
also provide FRV with LOCKDEP variants.

This is to be applied to -mm kernel since not all of the functions added exist
in the main kernel.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---
warthog>diffstat -p1 frv-irq-lockdep-2618rc5mm1.diff
 include/asm-frv/irq.h     |   43 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/interrupt.h |    2 ++
 2 files changed, 45 insertions(+)

diff -urp ../kernels/linux-2.6.18-rc5-mm1/include/asm-frv/irq.h 
linux-2.6.18-rc5-mm1-frv/include/asm-frv/irq.h
--- ../kernels/linux-2.6.18-rc5-mm1/include/asm-frv/irq.h   2006-09-04 
18:02:48.0 +0100
+++ linux-2.6.18-rc5-mm1-frv/include/asm-frv/irq.h  2006-09-05 
15:59:08.0 +0100
@@ -39,5 +39,48 @@ extern void disable_irq_nosync(unsigned 
 extern void disable_irq(unsigned int irq);
 extern void enable_irq(unsigned int irq);
 
+#ifdef CONFIG_LOCKDEP
+/*
+ * Special lockdep variants of irq disabling/enabling.
+ * These should be used for locking constructs that
+ * know that a particular irq context which is disabled,
+ * and which is the only irq-context user of a lock,
+ * that it's safe to take the lock in the irq-disabled
+ * section without disabling hardirqs.
+ *
+ * On !CONFIG_LOCKDEP they are equivalent to the normal
+ * irq disable/enable methods.
+ */
+static inline void disable_irq_nosync_lockdep(unsigned int irq)
+{
+   disable_irq_nosync(irq);
+   local_irq_disable();
+}
+
+static inline void disable_irq_nosync_lockdep_irqsave(unsigned int irq, unsigned long *flags)
+{
+   disable_irq_nosync(irq);
+   local_irq_save(*flags);
+}
+
+static inline void disable_irq_lockdep(unsigned int irq)
+{
+   disable_irq(irq);
+   local_irq_disable();
+}
+
+static inline void enable_irq_lockdep(unsigned int irq)
+{
+   local_irq_enable();
+   enable_irq(irq);
+}
+
+static inline void enable_irq_lockdep_irqrestore(unsigned int irq, unsigned long *flags)
+{
+   local_irq_restore(*flags);
+   enable_irq(irq);
+}
+#endif /* CONFIG_LOCKDEP */
+
 
 #endif /* _ASM_IRQ_H_ */
diff -urp ../kernels/linux-2.6.18-rc5-mm1/include/linux/interrupt.h 
linux-2.6.18-rc5-mm1-frv/include/linux/interrupt.h
--- ../kernels/linux-2.6.18-rc5-mm1/include/linux/interrupt.h   2006-09-04 
18:03:31.0 +0100
+++ linux-2.6.18-rc5-mm1-frv/include/linux/interrupt.h  2006-09-05 
15:58:53.0 +0100
@@ -178,6 +178,8 @@ static inline int disable_irq_wake(unsig
 #  define disable_irq_nosync_lockdep(irq)  disable_irq_nosync(irq)
 #  define disable_irq_lockdep(irq) disable_irq(irq)
 #  define enable_irq_lockdep(irq)  enable_irq(irq)
+#  define disable_irq_nosync_lockdep_irqsave(irq, flags) disable_irq_nosync(irq)
+#  define enable_irq_lockdep_irqrestore(irq, flags) enable_irq(irq)
 # endif
 
 #endif /* CONFIG_GENERIC_HARDIRQS */


[PATCH] FRV: do_gettimeofday() should no longer use tickadj

2006-09-05 Thread David Howells

Stop do_gettimeofday() on FRV from using tickadj, and model it after ARM
instead.

This patch also provides a placeholder macro for getting hardware timer data to
be filled in when such is available.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---
warthog>diffstat -p1 frv-tickadj-2618rc5mm1.diff
 arch/frv/kernel/time.c |   20 +++++---------------
 1 file changed, 5 insertions(+), 15 deletions(-)

diff -urp ../kernels/linux-2.6.18-rc5-mm1/arch/frv/kernel/time.c 
linux-2.6.18-rc5-mm1-frv/arch/frv/kernel/time.c
--- ../kernels/linux-2.6.18-rc5-mm1/arch/frv/kernel/time.c  2006-09-04 
18:03:14.0 +0100
+++ linux-2.6.18-rc5-mm1-frv/arch/frv/kernel/time.c 2006-09-05 
15:44:42.0 +0100
@@ -31,6 +31,9 @@
 
 #define TICK_SIZE (tick_nsec / 1000)
 
+/* H/W clock data if we can get it (in microseconds) */
+#define FRV_HW_CLOCK_DATA (0)
+
 unsigned long __nongprelbss __clkin_clock_speed_HZ;
 unsigned long __nongprelbss __ext_bus_clock_speed_HZ;
 unsigned long __nongprelbss __res_bus_clock_speed_HZ;
@@ -148,23 +151,10 @@ void do_gettimeofday(struct timeval *tv)
 {
    unsigned long seq;
    unsigned long usec, sec;
-   unsigned long max_ntp_tick;
 
    do {
        seq = read_seqbegin(&xtime_lock);
-
-       usec = 0;
-
-       /*
-        * If time_adjust is negative then NTP is slowing the clock
-        * so make sure not to go into next possible interval.
-        * Better to lose some accuracy than have time go backwards..
-        */
-       if (unlikely(time_adjust < 0)) {
-           max_ntp_tick = (USEC_PER_SEC / HZ) - tickadj;
-           usec = min(usec, max_ntp_tick);
-       }
-
+       usec = FRV_HW_CLOCK_DATA;
        sec = xtime.tv_sec;
        usec += (xtime.tv_nsec / 1000);
    } while (read_seqretry(&xtime_lock, seq));
@@ -195,7 +185,7 @@ int do_settimeofday(struct timespec *tv)
 * wall time.  Discover what correction gettimeofday() would have
 * made, and then undo it!
 */
-   nsec -= 0 * NSEC_PER_USEC;
+   nsec -= FRV_HW_CLOCK_DATA * NSEC_PER_USEC;
 
wtm_sec  = wall_to_monotonic.tv_sec + (xtime.tv_sec - sec);
wtm_nsec = wall_to_monotonic.tv_nsec + (xtime.tv_nsec - nsec);
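
For readers unfamiliar with the read_seqbegin()/read_seqretry() loop used
above, here is a rough userspace model of the idea.  It is only a sketch: the
toy_* names and TOY_HW_CLOCK_DATA are invented for the example, the data fields
are plain (not atomic), and it is meant to be read and run single-threaded
rather than taken as a correct concurrent seqlock.

```c
#include <stdatomic.h>

#define TOY_HW_CLOCK_DATA 0UL	/* placeholder, like FRV_HW_CLOCK_DATA */

struct toy_clock {
	atomic_uint seq;	/* odd while an update is in progress */
	unsigned long sec;
	unsigned long nsec;
};

void toy_clock_init(struct toy_clock *c)
{
	atomic_init(&c->seq, 0);
	c->sec = 0;
	c->nsec = 0;
}

/* Writer: bump the sequence before and after touching the data. */
void toy_clock_write(struct toy_clock *c, unsigned long sec, unsigned long nsec)
{
	atomic_fetch_add(&c->seq, 1);
	c->sec = sec;
	c->nsec = nsec;
	atomic_fetch_add(&c->seq, 1);
}

/* Reader: retry if a write was in progress or completed during the read. */
void toy_gettimeofday(struct toy_clock *c, unsigned long *sec, unsigned long *usec)
{
	unsigned int seq;
	unsigned long s, us;

	do {
		seq = atomic_load(&c->seq);
		us = TOY_HW_CLOCK_DATA;
		s = c->sec;
		us += c->nsec / 1000;
	} while ((seq & 1) || atomic_load(&c->seq) != seq);

	/* Normalise in case the microsecond count overflowed a second. */
	while (us >= 1000000) {
		us -= 1000000;
		s++;
	}
	*sec = s;
	*usec = us;
}
```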


[PATCH] NOMMU: Provide page_mkclean() for NOMMU

2006-09-05 Thread David Howells

Provide a page_mkclean() implementation for NOMMU.  This doesn't do anything
except return successfully as there are no PTEs for it to play with.

This is only relevant to the -mm kernels.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---
warthog>diffstat -p1 nommu-page_mkclean-2618rc5mm1.diff
 include/linux/rmap.h |    6 ++++++
 1 file changed, 6 insertions(+)

diff -urp ../kernels/linux-2.6.18-rc5-mm1/include/linux/rmap.h 
linux-2.6.18-rc5-mm1-frv/include/linux/rmap.h
--- ../kernels/linux-2.6.18-rc5-mm1/include/linux/rmap.h2006-09-04 
18:03:32.0 +0100
+++ linux-2.6.18-rc5-mm1-frv/include/linux/rmap.h   2006-09-05 
15:34:35.0 +0100
@@ -120,6 +120,12 @@ int page_mkclean(struct page *);
 #define page_referenced(page,l) TestClearPageReferenced(page)
 #define try_to_unmap(page, refs) SWAP_FAIL
 
+static inline int page_mkclean(struct page *page)
+{
+   return 0;
+}
+
+
 #endif /* CONFIG_MMU */
 
 /*


[PATCH] NOMMU: Make lib/ioremap.c conditional

2006-09-05 Thread David Howells

Make lib/ioremap.c conditional on CONFIG_MMU.  It plays with PTEs which don't
exist under NOMMU conditions.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---
warthog>diffstat -p1 nommu-ioremap-2618rc5mm1.diff
 lib/Makefile |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff -urp ../kernels/linux-2.6.18-rc5-mm1/lib/Makefile 
linux-2.6.18-rc5-mm1-frv/lib/Makefile
--- ../kernels/linux-2.6.18-rc5-mm1/lib/Makefile2006-09-04 
18:03:32.0 +0100
+++ linux-2.6.18-rc5-mm1-frv/lib/Makefile   2006-09-05 16:01:38.0 
+0100
@@ -5,8 +5,9 @@
 lib-y := ctype.o string.o vsprintf.o cmdline.o \
 bust_spinlocks.o rbtree.o radix-tree.o dump_stack.o \
 idr.o div64.o int_sqrt.o bitmap.o extable.o prio_tree.o \
-sha1.o ioremap.o
+sha1.o
 
+lib-$(CONFIG_MMU) += ioremap.o
 lib-$(CONFIG_SMP) += cpumask.o
 
 lib-y  += kobject.o kref.o kobject_uevent.o klist.o


[PATCH] NOMMU: Move the fallback arch_vma_name() to a sensible place

2006-09-05 Thread David Howells

Move the fallback arch_vma_name() to a sensible place (kernel/signal.c).

Currently it's in fs/proc/task_mmu.c, a file that is only built when both
CONFIG_PROC_FS and CONFIG_MMU are enabled, but it is called unconditionally
from kernel/signal.c.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---
warthog>diffstat -p1 nommu-arch_vma_name-2618rc5mm1.diff
 fs/proc/task_mmu.c |    5 -----
 kernel/signal.c    |    5 +++++
 2 files changed, 5 insertions(+), 5 deletions(-)

diff -urp ../kernels/linux-2.6.18-rc5-mm1/fs/proc/task_mmu.c 
linux-2.6.18-rc5-mm1-frv/fs/proc/task_mmu.c
--- ../kernels/linux-2.6.18-rc5-mm1/fs/proc/task_mmu.c  2006-09-04 
18:02:43.0 +0100
+++ linux-2.6.18-rc5-mm1-frv/fs/proc/task_mmu.c 2006-09-05 15:49:18.0 
+0100
@@ -122,11 +122,6 @@ struct mem_size_stats
unsigned long private_dirty;
 };
 
-__attribute__((weak)) const char *arch_vma_name(struct vm_area_struct *vma)
-{
-   return NULL;
-}
-
 static int show_map_internal(struct seq_file *m, void *v, struct mem_size_stats *mss)
 {
    struct proc_maps_private *priv = m->private;
diff -urp ../kernels/linux-2.6.18-rc5-mm1/kernel/signal.c 
linux-2.6.18-rc5-mm1-frv/kernel/signal.c
--- ../kernels/linux-2.6.18-rc5-mm1/kernel/signal.c 2006-09-04 
18:03:32.0 +0100
+++ linux-2.6.18-rc5-mm1-frv/kernel/signal.c2006-09-05 15:49:19.0 
+0100
@@ -773,6 +773,11 @@ static void pad_len_spaces(int len)
    printk("%*c", len, ' ');
 }
 
+__attribute__((weak)) const char *arch_vma_name(struct vm_area_struct *vma)
+{
+   return NULL;
+}
+
 static int print_vma(struct vm_area_struct *vma)
 {
    struct mm_struct *mm = vma->vm_mm;
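
As a small sketch of why the fallback has to live in a file that is always
built: a weak definition supplies a default that a strong definition elsewhere
can override at link time, but if the file holding the weak default is not
compiled at all, unconditional callers fail to link.  The names below are
hypothetical stand-ins, and __attribute__((weak)) is a GCC/Clang extension.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct vm_area_struct. */
struct toy_region {
	unsigned long start;
	unsigned long end;
};

/* Weak fallback: used only when no other object file provides a strong
 * toy_arch_region_name(). */
__attribute__((weak)) const char *toy_arch_region_name(const struct toy_region *r)
{
	(void)r;
	return NULL;
}

/* Unconditional caller, analogous to the user in kernel/signal.c. */
const char *toy_region_label(const struct toy_region *r)
{
	const char *name = toy_arch_region_name(r);

	return name ? name : "[anon]";
}
```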


Re: [PATCH] FRV: do_gettimeofday() should no longer use tickadj

2006-09-06 Thread David Howells
john stultz [EMAIL PROTECTED] wrote:

> From this patch it looks like the FRV arch could be trivially converted
> to GENERIC_TIME.
>
> Would you consider the following, totally untested patch?

It certainly looks interesting.  I'll have to study the clocksource stuff -
some FRV CPUs have an effective TSC.

David


Re: [PATCH] FRV: do_gettimeofday() should no longer use tickadj

2006-09-06 Thread David Howells
Ingo Molnar [EMAIL PROTECTED] wrote:

> btw., would be nice to convert it to genirq (and irqchips) too =B-) That
> would solve the kind of disable_irq_lockdep() breakage that was reported
> recently.

I can think of reasons for not using that stuff also.

 (1) Passing struct pt_regs *regs around is a complete waste of resources on
 FRV.  It's in GR28 at all times and can thus be accessed directly.

 (2) All the little operations functions cause unnecessary jumping, jumps that
 icache lookahead can't predict because they're register-indirect.

 (3) ACK'ing and controlling interrupts has to be done by groups.

 (4) No account is taken of interrupt priority.

 (5) The FRV CPU doesn't tell me which IRQ source fired.  Much of the code
 I've got is stuff to try and work it out.  I could just blindly poll all
 the sources attached to a particular interrupt level, but that seems
 somehow less efficient.

David

BTW, have you looked at my patch to fix lockdep yet?


Re: [PATCH] FRV: do_gettimeofday() should no longer use tickadj

2006-09-07 Thread David Howells
Ingo Molnar [EMAIL PROTECTED] wrote:

> > So, again, why _should_ I use the generic IRQ stuff? [...]
>
> To have shared code between architectures?

That's reasonable as far as it goes: the algorithms are similar per-arch, but
the PICs are quite often quite different.  My FRV board here has three very
different ones, none of them compatible with anything else as far as I know.

> To make generic API updates easier for all of us?

That's reasonable.

> To have less cruft in interrupt.h?

That's specious.  The whole point of having arch-specific code is to support
arch-specific stuff.

> To not having to add last-minute patches to v2.6.18 because some arch
> defines its own IRQ prototypes and a difficult generic feature like irqtrace
> breaks?

Specious again.  If whoever made the changes had got them right in the first
place, then it wouldn't have required a last-minute patch for the LOCKDEP=n
case, now would it?

If you're going to insist on the genirq stuff being used, then you should take
CONFIG_GENERIC_HARDIRQS away and force everyone else to move to what you've
decided they should use, right?

> To get new IRQ subsystem features for free like preemptible irqs, irqpoll or
> SHIRQ debugging?

That's reasonable, but you don't necessarily get features for free when you
add up the cost of having support there for them.  The features appear for the
subscribed arches automatically, and so do the costs.

> hm, could you take a look at why that difference happens? Do you make
> use of __do_IRQ()?

I did say I used it.  In fact, as far as I can tell, I have to use it
recursively.  There doesn't seem to be any other way in that's correct.

> Do you make use of all the various flow handlers that are offered in
> handle.c?

Some of them.

> Could you #ifdef out all the functions that are unused? The kernel build
> process doesnt remove them and i havent (yet) put them into a library.

I could get away with commenting out:

no_action()
set_irq_wake()
can_request_irq()
set_irq_type()
set_irq_data()
set_irq_chip_data()
handle_simple_irq()
handle_fasteio_irq()
bits of handle_irq_name() corresponding to the previous two

This results in a small shrinkage of text and a slight increase in the amount
of data used:

   text    data     bss     dec     hex filename
1993023   77908  166964 2237895  2225c7 vmlinux [before]
1991407   77912  166964 2236283  221f7b vmlinux [after]
---
   1616      -4       0    1612

The increase in data size is slightly puzzling, but may have something to do
with there being fewer strings in handle_irq_name().  The text decrease is
about 12% of the unmodified total:

   text    data     bss     dec     hex filename
  10908    3272      12   14192    3770 kernel/irq/built-in.o
   1548      64       4    1616     650 arch/frv/kernel/irq.o
    744     192       0     936     3a8 arch/frv/kernel/irq-mb93091.o
---
  13200    3528      16   16744         total

> the same "why should we share code" argument could be made for the VFS too.

That argument doesn't really follow.  We only have one interrupt system in the
kernel, but we have lots of different filesystems.

> Sharing code has a (small) price most of the time, but it's also very much
> worth it. I think the size increases you are seeing are artificial
Artificial in what manner?  I haven't added extra code to genirq to make it
look bad or anything like that.

> and most of it is not caused by the indirections. If they were caused by the
> indirections i'd probably agree with you.

I think most of the size increase is due to the core genirq function set being
large, not the indirections themselves.  There aren't many indirections
implemented in the core set.

The indirected functions exist in the arch code for the most part, and where
they are implemented they are generally very small.  In FRV's case, one lot in
arch/frv/kernel/irq.c for the CPU and one lot in arch/frv/kernel/irq-mb93091.c
or similar for the on-motherboard FPGA.
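
To make the distinction concrete, here is a toy contrast between reaching PIC
operations through a table of function pointers and open-coding the same steps
so the compiler can inline them.  Everything in it (toy_pic, toy_irq_ops and
the service functions) is invented for illustration and is not the genirq or
FRV code.

```c
/* Toy PIC state: one mask register and a count of serviced interrupts. */
struct toy_pic {
	unsigned int mask;	/* bit n set: source n is masked */
	unsigned int handled;
};

/* Operations table, in the style of a chip-ops structure. */
struct toy_irq_ops {
	void (*mask)(struct toy_pic *pic, unsigned int irq);
	void (*unmask)(struct toy_pic *pic, unsigned int irq);
};

static inline void toy_mask(struct toy_pic *pic, unsigned int irq)
{
	pic->mask |= 1u << irq;
}

static inline void toy_unmask(struct toy_pic *pic, unsigned int irq)
{
	pic->mask &= ~(1u << irq);
}

const struct toy_irq_ops toy_ops = {
	.mask	= toy_mask,
	.unmask	= toy_unmask,
};

/* Table-driven flow: two register-indirect calls per interrupt. */
void toy_service_via_ops(const struct toy_irq_ops *ops, struct toy_pic *pic,
			 unsigned int irq)
{
	ops->mask(pic, irq);
	pic->handled++;
	ops->unmask(pic, irq);
}

/* Open-coded flow: the same steps with direct, inlinable calls. */
void toy_service_direct(struct toy_pic *pic, unsigned int irq)
{
	toy_mask(pic, irq);
	pic->handled++;
	toy_unmask(pic, irq);
}
```

Both paths leave the PIC in the same state; the difference is only in how the
mask and unmask steps are reached.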

> if your argument were true every arch should run its whole Linux kernel
> in arch/frv, with zero sharing with anyone else.

Not really.  eCos manages this more efficiently than Linux, with generally
fewer indirections through the use of macros and inline functions.

At some point you do have to draw a line and do common stuff.  The VFS is
definitely in the common region.  It has little need of arch-specific stuff,
and what it does need is quite readily encapsulated in inline functions where
it has little effect on the space.  I'm not entirely convinced that this
applies to interrupt handling though.  At the basic level, that is very
arch-dependent.

> There's always a lot of 'unnecessary' stuff all around the kernel that is
> just a hindrance for FRV.

Or any other platform, embedded or otherwise, 

Re: [PATCH] FRV: do_gettimeofday() should no longer use tickadj

2006-09-07 Thread David Howells
Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:

> Well, genirq gives you more flexibility than the current mechanism so ...

No, it doesn't, because the FRV arch contains its own mechanism and I can do
whatever I like in it.

genirq's flexibility comes at a price.  Count the number of hooks in struct
irq_chip and struct irq_desc together.

> If I understand correctly, you need to do scary stuff to figure out your
> toplevel irq, which shouldn't be a problem with either mechanism...

Yeah.  I can't actually find out what source caused top-level IRQs.  I have to
guess from looking at the IRQ priority and poking around in the hardware.  I
got bitten that way too: at one point, I was peeking at the interrupt flag in
the serial regs, only to realise this was causing the driver to go wrong
because it cleared the interrupt requested flag in the UART.

Obviously I'd rather not use IRQ prioritisation to help multiplex irqs, but
unless I want a large polling set...

> Now, if you have funky cascades, then you can always group them into a
> virtual irq cascade line and have a special chained flow handler that
> does all the figuring out of those... it's up to you.

You make it sound so easy, but it's not obvious how to do this, apart from
installing interrupt handlers for the auxiliary PIC interrupts on the CPU and
having those call back into __do_IRQ().  Chaining isn't mentioned in
genericirq.tmpl.

> In general, I found genirq allowed me to do more fancy stuff, and end up
> with actually less hooks and indirect function calls on the path to a
> given irq than before as you can use tailored flow handlers that do just
> the right thing.

My code in the FRV arch has fewer indirection calls than the genirq code
simply because it doesn't require tables of operations.  I can trace through
it with gdb and see them.

I built all the stuff that genirq does in indirections directly into the
handlers.  It certainly has fewer hooks.

I attempted to convert it over to use genirq, and I came up with some numbers:

The difference in kernel sizes:

   text    data     bss     dec     hex filename
1993023   77912  166964 2237899  2225cb vmlinux  [with genirq]
1986511   76016  167908 2230435  2208a3 vmlinux  [without genirq]

The genirq subdir all wraps up into this:

  10908    3272      12   14192    3770 kernel/irq/built-in.o
   1548      64       4    1616     650 arch/frv/kernel/irq.o
-
  12456    3336      16   15808    3dc0 total

My FRV-specific IRQ handling wraps up into these:

    480     488       0     968     3c8 arch/frv/kernel/irq-mb93091.o
   4688      16     520    5224    1468 arch/frv/kernel/irq.o
   1576    1152      16    2744     ab8 arch/frv/kernel/irq-routing.o
-
   6744    1656     536    8936    22e8 total

There's a difference in BSS size in the main kernel that I can't account for,
but basically genirq uses 6.3KB more code and 1.8KB more initialised data, but
0.9KB less BSS.  Overall, about 7.2KB more memory.  I can shrink the BSS usage
in the FRV specific version by reducing the amount of space in the IRQ sources
table.
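
The KB figures follow from the two vmlinux rows above.  As a worked check, with
the numbers copied from the size output and the struct and function names made
up for the example:

```c
/* One row of `size` output, in bytes. */
struct toy_size_row {
	long text;
	long data;
	long bss;
};

const struct toy_size_row with_genirq    = { 1993023, 77912, 166964 };
const struct toy_size_row without_genirq = { 1986511, 76016, 167908 };

/* Total footprint of a row (the "dec" column). */
long toy_size_total(struct toy_size_row r)
{
	return r.text + r.data + r.bss;
}

/* Differences, genirq minus non-genirq:
 *   text:  6512 bytes more
 *   data:  1896 bytes more
 *   bss:    944 bytes less
 *   total: 7464 bytes more
 */
long toy_text_delta(void)  { return with_genirq.text - without_genirq.text; }
long toy_data_delta(void)  { return with_genirq.data - without_genirq.data; }
long toy_bss_delta(void)   { return with_genirq.bss - without_genirq.bss; }
long toy_total_delta(void)
{
	return toy_size_total(with_genirq) - toy_size_total(without_genirq);
}
```

Dividing by 1024 gives roughly the 6.3KB, 1.8KB, 0.9KB and 7.2KB quoted.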

So, again, why _should_ I use the generic IRQ stuff?  It's bigger and very
probably slower than what I already have.

David


Re: [PATCH] FRV: do_gettimeofday() should no longer use tickadj

2006-09-06 Thread David Howells
Ingo Molnar [EMAIL PROTECTED] wrote:

> we'll get rid of that pt_regs thing centrally, from all drivers at once
> - there's upstream buy-in for that already, and Thomas already generated
> a test-patch for that a few months ago. But it's not a big issue right
> now.

Yay!  Can you give me a pointer to the patch?

> this shouldnt be a big issue either - we use indirect jumps all around
> the kernel.

Yes, I know.  I'm sometimes concerned at just how fast indirect jumps (and even
direct calls) are proliferating.  Look at the read syscall path for something
like ext3 these days: it's like a pile of spaghetti.  That seems particularly
true of direct-IO where it seems to weave in and out of core code and the
filesystem as it goes down.  I'm also concerned about stack usage.

> CPUs are either smart enough to predict it

I was told a while back (2002?) not to use indirect pointers for some stuff
because CPUs _couldn't_ predict it.  Maybe this has changed in modern CPUs.

> >  (3) ACK'ing and controlling interrupts has to be done by groups.
>
> please be more specific,

Under some circumstances I can work out which sources have triggered which
interrupts (there are various off-CPU FPGAs which implement auxiliary PICs that
do announce their sources), but the aux-PIC channels are grouped together upon
delivery to the CPU PIC, so some of the ACK'ing has to be done at the group
level.

> how is this not possible via genirq?

How is it possible with genirq?

Unless I tie all the grouped sources together into one virtual IRQ line, this
doesn't appear to be possible.  But doing that I might then also have a mixed
set of flow types in any particular IRQ.

> >  (4) No account is taken of interrupt priority.
>
> hm, i'm not sure what you mean - could you be more specific?

The FRV CPU, like many others, supports interrupt prioritisation.  A particular
interrupt level is set in the PSR, and any interrupt of a higher priority can
interrupt.  do_IRQ() can then do the interrupt processing in the interrupt
level of the interrupt that invoked it, thus permitting higher priority
interrupts to still happen.
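
A rough model of that behaviour, purely for illustration (the toy_cpu
structure and function names are invented; this is not the FRV PSR layout or
the real do_IRQ()):

```c
/* Toy CPU: 'level' plays the part of the interrupt level held in the PSR.
 * Level 0 means no interrupt is being serviced. */
struct toy_cpu {
	unsigned int level;
};

/* An interrupt gets through only if its priority is above the current level. */
int toy_irq_accepted(const struct toy_cpu *cpu, unsigned int prio)
{
	return prio > cpu->level;
}

/* Service an interrupt at its own level, so that higher-priority interrupts
 * can still nest, then restore the previous level.  Returns 1 if serviced. */
int toy_do_irq(struct toy_cpu *cpu, unsigned int prio,
	       void (*handler)(struct toy_cpu *cpu))
{
	unsigned int saved;

	if (!toy_irq_accepted(cpu, prio))
		return 0;

	saved = cpu->level;
	cpu->level = prio;
	if (handler)
		handler(cpu);
	cpu->level = saved;
	return 1;
}
```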

> but ... somehow the current FRV code does figure out which IRQ source
> fired, right?

Not always; sometimes it has to fall back to polling the drivers unfortunately.

Btw why are we using IRQ_INPROGRESS, IRQ_DISABLED, IRQ_PENDING and friends?
They would appear unnecessary.

David


Re: [PATCH] FRV: do_gettimeofday() should no longer use tickadj

2006-09-08 Thread David Howells
Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:

> No, you do a chain handler. Look at how I do it in
> arch/powerpc/platform/pseries/setup.c for example. It's actually
> trivial. You install a special flow handler (which means that there is
> very little overhead, almost none, from the toplevel irq to the chained
> irq). You can _also_ if you want just install an IRQ handler for the
> cascaded controller and call generic_handle_irq (rather than __do_IRQ)
> from it, but that has more overhead. A chained handler completely
> replaces the flow handler for the cascade, and thus, if you don't need
> all of the nits and bits of the other flow handlers for your cascade,
> you can speed things up by hooking at that level.

Please update Documentation/DocBook/genericirq.tmpl.  That doesn't mention it.

David


Re: [PATCH] FRV: do_gettimeofday() should no longer use tickadj

2006-09-08 Thread David Howells
Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:

> > Please update Documentation/DocBook/genericirq.tmpl.  That doesn't mention
> > it.
>
> I must admit I haven't read the documentation :) I looked at the
> code/patches when genirq was posted and did my powerpc implementation
> based on my understanding of the code and discussions with Thomas and
> Ingo. I'll have a look at the doc next week and see if I can improve it.

While you're at it, you should also encomment pseries_8259_cascade() which is
what I suspect you're referring to in the powerpc sources.

David
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] FRV: do_gettimeofday() should no longer use tickadj

2006-09-08 Thread David Howells
Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:

   Now, if you have funky cascades, then you can always group them into a
   virtual irq cascade line and have a special chained flow handler that
   does all the figuring out of those... it's up to you.
  
  You make it sound so easy, but it's not obvious how to do this, apart from
  installing interrupt handlers for the auxiliary PIC interrupts on the CPU
  and having those call back into __do_IRQ().  Chaining isn't mentioned in
  genericirq.tmpl.
 
 No, you do a chain handler. Look at how I do it in
 arch/powerpc/platform/pseries/setup.c for example. It's actually
 trivial. You install a special flow handler (which means that there is
 very little overhead, almost none, from the toplevel irq to the chained
 irq). You can _also_ if you want just install an IRQ handler for the
 cascaded controller and call generic_handle_irq (rather than __do_IRQ)
 from it, but that has more overhead. A chained handler completely
 replaces the flow handler for the cascade, and thus, if you don't need
 all of the nits and bits of the other flow handlers for your cascade,
 you can speed things up by hooking at that level.

But funky cascading using chained flow handlers doesn't work if the cascade
must share an IRQ with some other device, right?
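
For anyone following along, the two approaches can be modelled outside the
kernel.  What follows is only a userspace sketch: struct irq_desc_model,
dispatch_irq() and the other names are invented for illustration and are not
the genirq API.  It models a cascade line whose flow handler has been replaced
wholesale, so that none of the usual per-line bookkeeping or action handling
runs for the cascade itself, which is what prompts the question about sharing.

```c
#include <assert.h>
#include <stddef.h>

#define NR_MODEL_IRQS 16

struct irq_desc_model;
typedef void (*flow_handler_t)(unsigned int irq, struct irq_desc_model *desc);

/* Hypothetical, much-reduced stand-in for a per-line descriptor. */
struct irq_desc_model {
	flow_handler_t handle_irq;	/* flow handler for this line */
	void (*action)(unsigned int);	/* device handler, if one is registered */
	int flow_steps;			/* counts generic flow bookkeeping */
};

static struct irq_desc_model irq_table[NR_MODEL_IRQS];
static unsigned int cascade_pending;	/* line the modelled secondary PIC reports */
static int device_hits;

/* Call whatever flow handler is installed on the line. */
void dispatch_irq(unsigned int irq)
{
	struct irq_desc_model *desc = &irq_table[irq];

	if (desc->handle_irq)
		desc->handle_irq(irq, desc);
}

/* Ordinary flow: do the per-line bookkeeping, then run the device action. */
void simple_flow(unsigned int irq, struct irq_desc_model *desc)
{
	desc->flow_steps++;
	if (desc->action)
		desc->action(irq);
}

/* Chained flow: no bookkeeping and no action for the cascade line itself;
 * just ask the secondary controller which line fired and dispatch that. */
void cascade_flow(unsigned int irq, struct irq_desc_model *desc)
{
	(void)irq;
	(void)desc;
	dispatch_irq(cascade_pending);
}

void count_device_hit(unsigned int irq)
{
	(void)irq;
	device_hits++;
}

int cascade_flow_steps(void)
{
	return irq_table[2].flow_steps;
}

int run_cascade_demo(void)
{
	irq_table[2].handle_irq = cascade_flow;	/* line 2 is the cascade */
	irq_table[9].handle_irq = simple_flow;	/* line 9 hangs off it */
	irq_table[9].action = count_device_hit;
	cascade_pending = 9;

	dispatch_irq(2);
	return device_hits;
}
```

In this model the cascade line never looks at an action of its own, so a
second device registered on line 2 would simply never be called.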

David


Re: [PATCH] FRV: do_gettimeofday() should no longer use tickadj

2006-09-11 Thread David Howells
Ingo Molnar [EMAIL PROTECTED] wrote:

 i cannot find Thomas' recent 2.6 one (Thomas, do you have a link to 
 it?), but i did one 5 years ago:
 
  http://people.redhat.com/mingo/irq-rewrite-patches/irq-cleanup-2.4.15-B1.bz2
 
 in general it's a large but otherwise pretty dumb patch.

I wrote my own patch to test this last Friday.  I found that removing all the
regs pointer passing from the interrupt code reduced interrupt entry with a
warm cache by 1 cpu cycle out of 87, and interrupt exit by 19 cycles out of 99.
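
As a rough worked figure, and assuming 87 and 99 are the cycle counts before
the change, that is a little over 1% off entry and a little over 19% off exit.
The helper below is just illustrative arithmetic (percent_saved() is a made-up
name):

```c
#include <assert.h>

/* Percentage of the original cycle count that was saved. */
double percent_saved(unsigned int saved, unsigned int original)
{
	return 100.0 * (double)saved / (double)original;
}
```

percent_saved(1, 87) covers the entry path and percent_saved(19, 99) the exit
path.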

I can't tell from that exactly how many instructions/memory accesses have been
removed since the FRV permits two instructions to be executed in one cycle
under some circumstances, and two registers to be stored/loaded in one
instruction.

But the main gain in the exit path has to be due to recovery of the clobbered
regs parameter due to a call inside a loop, possibly in handle_IRQ_event().

I'd expect i386 to do better in cycle reduction because it has fewer registers
and so getting one back should gain more.

David


Re: [PATCH 0/59] Cleanup sysctl

2007-01-16 Thread David Howells

The FRV bits look okay.  I can't test them until I get back from Australia in
Feb.

David


Re: [PATCH 25/59] sysctl: C99 convert arch/frv/kernel/pm.c

2007-01-24 Thread David Howells

Fine by me.

Acked-By: David Howells [EMAIL PROTECTED]


Re: [2.6 patch] kill net/rxrpc/rxrpc_syms.c

2006-11-27 Thread David Howells
Adrian Bunk [EMAIL PROTECTED] wrote:

 This patch moves the EXPORT_SYMBOL's from net/rxrpc/rxrpc_syms.c to the 
 files with the actual functions.

You can if you like.  Can you slap a blank line before each EXPORT_SYMBOL()
though please?

David


Re: [2.6 patch] kill net/rxrpc/rxrpc_syms.c

2006-11-28 Thread David Howells
Adrian Bunk [EMAIL PROTECTED] wrote:

   This patch moves the EXPORT_SYMBOL's from net/rxrpc/rxrpc_syms.c to the 
   files with the actual functions.
  
  You can if you like.  Can you slap a blank line before each EXPORT_SYMBOL()
  though please?
 
 Updated patch below.

Acked-By: David Howells [EMAIL PROTECTED]


[PATCH 2/9] AF_RXRPC: Move generic skbuff stuff from XFRM code to generic code

2007-04-02 Thread David Howells
Move generic skbuff stuff from XFRM code to generic code so that AF_RXRPC can
use it too.

The kdoc comments I've attached to the functions need to be checked by whoever
wrote those functions, as I had to make some guesses about how they work.
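
To make the intent of skb_to_sgvec() easier to review, here is a userspace
sketch of the same walk.  It is not the kernel code: struct chunk, struct
sg_entry and chunks_to_sgvec() are hypothetical names, and the head, page
frags and frag_list of a real sk_buff are flattened into a plain array of
chunks.  It maps the region [offset, offset + len) of the concatenated chunks
into scatter-gather entries and returns how many entries were filled.

```c
#include <assert.h>
#include <stddef.h>

struct chunk {			/* hypothetical: one piece of buffer space */
	const char *base;
	int size;
};

struct sg_entry {		/* hypothetical: one scatter-gather element */
	const char *base;
	int offset;
	int length;
};

/* Map [offset, offset + len) of the concatenated chunks into sg[].
 * Returns the number of sg entries used. */
int chunks_to_sgvec(const struct chunk *chunks, int nr_chunks,
		    struct sg_entry *sg, int offset, int len)
{
	int start = 0;		/* logical position where this chunk begins */
	int elt = 0;
	int i;

	for (i = 0; i < nr_chunks && len > 0; i++) {
		int end = start + chunks[i].size;
		int copy = end - offset;  /* bytes of this chunk at/after offset */

		if (copy > 0) {
			if (copy > len)
				copy = len;
			sg[elt].base = chunks[i].base;
			sg[elt].offset = offset - start;
			sg[elt].length = copy;
			elt++;
			len -= copy;
			offset += copy;
		}
		start = end;
	}
	return elt;
}
```

With chunks of 8, 4 and 6 bytes, asking for 8 bytes starting at offset 6
yields three entries: the last 2 bytes of the first chunk, all of the second
and the first 2 bytes of the third.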

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 include/linux/skbuff.h |6 ++
 include/net/esp.h  |2 -
 net/core/skbuff.c  |  188 
 net/xfrm/xfrm_algo.c   |  169 ---
 4 files changed, 194 insertions(+), 171 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 82f43ad..d53ff7c 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -83,6 +83,7 @@
  */
 
 struct net_device;
+struct scatterlist;
 
 #ifdef CONFIG_NETFILTER
 struct nf_conntrack {
@@ -364,6 +365,11 @@ extern struct sk_buff *skb_realloc_headroom(struct sk_buff *skb,
 extern struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
 				       int newheadroom, int newtailroom,
 				       gfp_t priority);
+extern int	       skb_to_sgvec(struct sk_buff *skb,
+				    struct scatterlist *sg, int offset,
+				    int len);
+extern int	       skb_cow_data(struct sk_buff *skb, int tailbits,
+				    struct sk_buff **trailer);
 extern int	       skb_pad(struct sk_buff *skb, int pad);
 #define dev_kfree_skb(a)	kfree_skb(a)
 extern void	      skb_over_panic(struct sk_buff *skb, int len,
diff --git a/include/net/esp.h b/include/net/esp.h
index 713d039..d05d8d2 100644
--- a/include/net/esp.h
+++ b/include/net/esp.h
@@ -40,8 +40,6 @@ struct esp_data
} auth;
 };
 
-extern int skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len);
-extern int skb_cow_data(struct sk_buff *skb, int tailbits, struct sk_buff **trailer);
 extern void *pskb_put(struct sk_buff *skb, struct sk_buff *tail, int len);
 
 static inline int esp_mac_digest(struct esp_data *esp, struct sk_buff *skb,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 87573ae..156b9c0 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -55,6 +55,7 @@
 #include <linux/cache.h>
 #include <linux/rtnetlink.h>
 #include <linux/init.h>
+#include <linux/scatterlist.h>
 
 #include <net/protocol.h>
 #include <net/dst.h>
@@ -2060,6 +2061,190 @@ void __init skb_init(void)
NULL, NULL);
 }
 
+/**
+ * skb_to_sgvec - Fill a scatter-gather list from a socket buffer
+ * @skb: Socket buffer containing the buffers to be mapped
+ * @sg: The scatter-gather list to map into
+ * @offset: The offset into the buffer's contents to start mapping
+ * @len: Length of buffer space to be mapped
+ *
+ * Fill the specified scatter-gather list with mappings/pointers into a
+ * region of the buffer space attached to a socket buffer.
+ */
+int
+skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
+{
+	int start = skb_headlen(skb);
+	int i, copy = start - offset;
+	int elt = 0;
+
+	if (copy > 0) {
+		if (copy > len)
+			copy = len;
+		sg[elt].page = virt_to_page(skb->data + offset);
+		sg[elt].offset = (unsigned long)(skb->data + offset) % PAGE_SIZE;
+		sg[elt].length = copy;
+		elt++;
+		if ((len -= copy) == 0)
+			return elt;
+		offset += copy;
+	}
+
+	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+		int end;
+
+		BUG_TRAP(start <= offset + len);
+
+		end = start + skb_shinfo(skb)->frags[i].size;
+		if ((copy = end - offset) > 0) {
+			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+
+			if (copy > len)
+				copy = len;
+			sg[elt].page = frag->page;
+			sg[elt].offset = frag->page_offset+offset-start;
+			sg[elt].length = copy;
+			elt++;
+			if (!(len -= copy))
+				return elt;
+			offset += copy;
+		}
+		start = end;
+	}
+
+	if (skb_shinfo(skb)->frag_list) {
+		struct sk_buff *list = skb_shinfo(skb)->frag_list;
+
+		for (; list; list = list->next) {
+			int end;
+
+			BUG_TRAP(start <= offset + len);
+
+			end = start + list->len;
+			if ((copy = end - offset) > 0) {
+				if (copy > len)
+					copy = len;
+				elt += skb_to_sgvec(list, sg+elt, offset - start, copy);
+				if ((len -= copy) == 0

[PATCH 1/9] AF_RXRPC: Add blkcipher accessors for using kernel data directly

2007-04-02 Thread David Howells
Add blkcipher accessors for using kernel data directly without the use of
scatter lists.

Also add a CRYPTO_ALG_DMA algorithm capability flag to permit or deny the use
of DMA and hardware accelerators.  A hardware accelerator may not be used to
access any arbitrary piece of kernel memory lest it not be in a DMA'able
region.  Only software algorithms may do that.

If kernel data is going to be accessed directly, then CRYPTO_ALG_DMA must, for
instance, be passed in the mask of crypto_alloc_blkcipher(), but not the type.

This is used by AF_RXRPC to do quick encryptions, where the size of the data
being encrypted or decrypted is 8 bytes or, occasionally, 16 bytes (ie: one or
two chunks only), and since these data are generally on the stack they may be
split over two pages.  Because they're so small, and because they may be
misaligned, setting up a scatter-gather list is overly expensive.  It is very
unlikely that a hardware FCrypt PCBC engine will be encountered (there is not,
as far as I know, any such thing), and even if one is encountered, the
setup/teardown costs for such small transactions will almost certainly be
prohibitive.

Encrypting and decrypting whole packets, on the other hand, is done through the
scatter-gather list interface as the amount of data is sufficient that the
expense of doing virtual address to page calculations is sufficiently small by
comparison.
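
To illustrate the point about small stack buffers being split over two pages,
the following userspace sketch counts how many page-sized segments a buffer
touches, which is the number of scatter-gather entries it would need.  The
4096-byte page size and the name sg_entries_needed() are assumptions made for
the example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for the example */

/* Number of pages (and hence sg entries) that [addr, addr + len) touches. */
int sg_entries_needed(uintptr_t addr, size_t len)
{
	uintptr_t first, last;

	if (len == 0)
		return 0;
	first = addr / MODEL_PAGE_SIZE;
	last = (addr + len - 1) / MODEL_PAGE_SIZE;
	return (int)(last - first) + 1;
}
```

An 8-byte buffer that starts 4 bytes before a page boundary needs two entries,
whereas the same buffer aligned on 8 bytes always needs just one.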

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 crypto/blkcipher.c |2 +
 crypto/pcbc.c  |   62 +
 include/linux/crypto.h |  118 
 3 files changed, 181 insertions(+), 1 deletions(-)

diff --git a/crypto/blkcipher.c b/crypto/blkcipher.c
index b5befe8..4498b2d 100644
--- a/crypto/blkcipher.c
+++ b/crypto/blkcipher.c
@@ -376,6 +376,8 @@ static int crypto_init_blkcipher_ops(struct crypto_tfm *tfm, u32 type, u32 mask)
 	crt->setkey = setkey;
 	crt->encrypt = alg->encrypt;
 	crt->decrypt = alg->decrypt;
+	crt->encrypt_kernel = alg->encrypt_kernel;
+	crt->decrypt_kernel = alg->decrypt_kernel;
 
 	addr = (unsigned long)crypto_tfm_ctx(tfm);
 	addr = ALIGN(addr, align);
diff --git a/crypto/pcbc.c b/crypto/pcbc.c
index 5174d7f..fa76111 100644
--- a/crypto/pcbc.c
+++ b/crypto/pcbc.c
@@ -126,6 +126,36 @@ static int crypto_pcbc_encrypt(struct blkcipher_desc *desc,
return err;
 }
 
+static int crypto_pcbc_encrypt_kernel(struct blkcipher_desc *desc,
+				      u8 *dst, const u8 *src,
+				      unsigned int nbytes)
+{
+	struct blkcipher_walk walk;
+	struct crypto_blkcipher *tfm = desc->tfm;
+	struct crypto_pcbc_ctx *ctx = crypto_blkcipher_ctx(tfm);
+	struct crypto_cipher *child = ctx->child;
+	void (*xor)(u8 *, const u8 *, unsigned int bs) = ctx->xor;
+
+	BUG_ON(crypto_tfm_alg_capabilities(crypto_cipher_tfm(child)) &
+	       CRYPTO_ALG_DMA);
+
+	if (nbytes == 0)
+		return 0;
+
+	memset(&walk, 0, sizeof(walk));
+	walk.src.virt.addr = (u8 *) src;
+	walk.dst.virt.addr = (u8 *) dst;
+	walk.nbytes = nbytes;
+	walk.total = nbytes;
+	walk.iv = desc->info;
+
+	if (walk.src.virt.addr == walk.dst.virt.addr)
+		nbytes = crypto_pcbc_encrypt_inplace(desc, &walk, child, xor);
+	else
+		nbytes = crypto_pcbc_encrypt_segment(desc, &walk, child, xor);
+	return 0;
+}
+
 static int crypto_pcbc_decrypt_segment(struct blkcipher_desc *desc,
   struct blkcipher_walk *walk,
   struct crypto_cipher *tfm,
@@ -211,6 +241,36 @@ static int crypto_pcbc_decrypt(struct blkcipher_desc *desc,
return err;
 }
 
+static int crypto_pcbc_decrypt_kernel(struct blkcipher_desc *desc,
+				      u8 *dst, const u8 *src,
+				      unsigned int nbytes)
+{
+	struct blkcipher_walk walk;
+	struct crypto_blkcipher *tfm = desc->tfm;
+	struct crypto_pcbc_ctx *ctx = crypto_blkcipher_ctx(tfm);
+	struct crypto_cipher *child = ctx->child;
+	void (*xor)(u8 *, const u8 *, unsigned int bs) = ctx->xor;
+
+	BUG_ON(crypto_tfm_alg_capabilities(crypto_cipher_tfm(child)) &
+	       CRYPTO_ALG_DMA);
+
+	if (nbytes == 0)
+		return 0;
+
+	memset(&walk, 0, sizeof(walk));
+	walk.src.virt.addr = (u8 *) src;
+	walk.dst.virt.addr = (u8 *) dst;
+	walk.nbytes = nbytes;
+	walk.total = nbytes;
+	walk.iv = desc->info;
+
+	if (walk.src.virt.addr == walk.dst.virt.addr)
+		nbytes = crypto_pcbc_decrypt_inplace(desc, &walk, child, xor);
+	else
+		nbytes = crypto_pcbc_decrypt_segment(desc, &walk, child, xor);
+	return 0;
+}
+
 static void xor_byte(u8 *a, const u8 *b, unsigned int bs)
 {
do {
@@ -313,6 +373,8 @@ static struct crypto_instance

[PATCH 3/9] AF_RXRPC: Make it possible to merely try to cancel timers and delayed work

2007-04-02 Thread David Howells
Export try_to_del_timer_sync() for use by the RxRPC module.

Add a try_to_cancel_delayed_work() so that it is possible to merely attempt to
cancel a delayed work timer.
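
The three-way return value can be pictured with a small userspace model.
This is only a sketch under assumed names (struct model_work,
model_try_to_cancel()); it does not use the kernel timer or workqueue code.
It distinguishes a timer that is not scheduled, one that is queued and can be
removed, and one whose handler is currently running elsewhere.

```c
#include <assert.h>

enum model_timer_state {
	MODEL_TIMER_IDLE,	/* not scheduled */
	MODEL_TIMER_PENDING,	/* queued, handler not yet running */
	MODEL_TIMER_RUNNING	/* handler executing on another CPU */
};

struct model_work {
	enum model_timer_state timer;
	int pending;		/* stands in for the work item's pending bit */
};

/* Returns -1 if the timer is still active, 1 if it was removed and
 * 0 if it was not scheduled.  Only a removed timer releases the work. */
int model_try_to_cancel(struct model_work *work)
{
	switch (work->timer) {
	case MODEL_TIMER_PENDING:
		work->timer = MODEL_TIMER_IDLE;
		work->pending = 0;
		return 1;
	case MODEL_TIMER_RUNNING:
		return -1;
	default:
		return 0;
	}
}
```

A caller that sees -1 cannot assume the work is gone; it has to expect the
work routine to run (or be running) and cope with that.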

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 include/linux/workqueue.h |   21 +
 kernel/timer.c|2 ++
 2 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 2a7b38d..40a61ae 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -204,4 +204,25 @@ static inline int cancel_delayed_work(struct delayed_work *work)
return ret;
 }
 
+/**
+ * try_to_cancel_delayed_work - Try to kill pending scheduled, delayed work
+ * @work: the work to cancel
+ *
+ * Try to kill off a pending schedule_delayed_work().
+ * - The timer may still be running afterwards, and if so, the work may still
+ *   be pending
+ * - Returns -1 if timer still active, 1 if timer removed, 0 if not scheduled
+ * - Can be called from the work routine; if it's still pending, just return
+ *   and it'll be called again.
+ */
+static inline int try_to_cancel_delayed_work(struct delayed_work *work)
+{
+	int ret;
+
+	ret = try_to_del_timer_sync(&work->timer);
+	if (ret > 0)
+		work_release(&work->work);
+	return ret;
+}
+
 #endif
diff --git a/kernel/timer.c b/kernel/timer.c
index 440048a..ba4d6e0 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -505,6 +505,8 @@ out:
return ret;
 }
 
+EXPORT_SYMBOL(try_to_del_timer_sync);
+
 /**
  * del_timer_sync - deactivate a timer and wait for the handler to finish.
  * @timer: the timer to be deactivated



[PATCH 4/9] AF_RXRPC: Key facility changes for AF_RXRPC

2007-04-02 Thread David Howells
Export the keyring key type definition and document its availability.

Add alternative types into the key's type_data union to make it more useful.
Not all users necessarily want to use it as a list_head (AF_RXRPC doesn't, for
example), so make it clear that it can be used in other ways.
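
As an illustration of what the extra union members allow, here is a userspace
sketch.  The union below mirrors the members named in the patch, but the type
name key_type_data_model and the helper functions are invented for the
example.  A key type that has no use for a list link can keep two pointers or
two integers in the same storage.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical copy of the type_data union for illustration. */
union key_type_data_model {
	struct list_head link;	/* original use: link into a list */
	unsigned long x[2];	/* alternative: two integers */
	void *p[2];		/* alternative: two pointers */
};

/* Store a pair of private pointers instead of a list link. */
void stash_pointers(union key_type_data_model *td, void *a, void *b)
{
	td->p[0] = a;
	td->p[1] = b;
}

void *first_pointer(const union key_type_data_model *td)
{
	return td->p[0];
}

void *second_pointer(const union key_type_data_model *td)
{
	return td->p[1];
}
```

On the usual Linux ABIs the union stays the size of two pointers, so the
alternatives cost no extra space in struct key.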

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 Documentation/keys.txt  |   12 
 include/linux/key.h |2 ++
 security/keys/keyring.c |2 ++
 3 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/Documentation/keys.txt b/Documentation/keys.txt
index 60c665d..81d9aa0 100644
--- a/Documentation/keys.txt
+++ b/Documentation/keys.txt
@@ -859,6 +859,18 @@ payload contents for more information.
void unregister_key_type(struct key_type *type);
 
 
+Under some circumstances, it may be desirable to deal with a
+bundle of keys.  The facility provides access to the keyring type for managing
+such a bundle:
+
+   struct key_type key_type_keyring;
+
+This can be used with a function such as request_key() to find a specific
+keyring in a process's keyrings.  A keyring thus found can then be searched
+with keyring_search().  Note that it is not possible to use request_key() to
+search a specific keyring, so using keyrings in this way is of limited utility.
+
+
 ===
 NOTES ON ACCESSING PAYLOAD CONTENTS
 ===
diff --git a/include/linux/key.h b/include/linux/key.h
index 169f05e..a9220e7 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -160,6 +160,8 @@ struct key {
 */
union {
 		struct list_head	link;
+		unsigned long		x[2];
+		void			*p[2];
} type_data;
 
/* key data
diff --git a/security/keys/keyring.c b/security/keys/keyring.c
index ad45ce7..88292e3 100644
--- a/security/keys/keyring.c
+++ b/security/keys/keyring.c
@@ -66,6 +66,8 @@ struct key_type key_type_keyring = {
.read   = keyring_read,
 };
 
+EXPORT_SYMBOL(key_type_keyring);
+
 /*
  * semaphore to serialise link/link calls to prevent two link calls in parallel
  * introducing a cycle



[PATCH 7/9] AF_RXRPC: Add an interface to the AF_RXRPC module for the AFS filesystem to use

2007-04-02 Thread David Howells
,
unsigned long user_call_ID,
struct sk_buff *skb);

void
rxrpc_kernel_intercept_rx_messages(struct socket *sock,
   rxrpc_interceptor_t interceptor);

 This installs an interceptor function on the specified AF_RXRPC socket.
 All messages that would otherwise wind up in the socket's Rx queue are
 then diverted to this function.  Note that care must be taken to process
 the messages in the right order to maintain DATA message sequentiality.

 The interceptor function itself is provided with the address of the socket
 handling the incoming message, the ID assigned by the kernel utility to the
 call, and the socket buffer containing the message.

 The skb-mark field indicates the type of message:

	MARK				MEANING
	===============================	=======================================
	RXRPC_SKB_MARK_DATA		Data message
	RXRPC_SKB_MARK_FINAL_ACK	Final ACK received for an incoming call
	RXRPC_SKB_MARK_BUSY		Client call rejected as server busy
	RXRPC_SKB_MARK_REMOTE_ABORT	Call aborted by peer
	RXRPC_SKB_MARK_NET_ERROR	Network error detected
	RXRPC_SKB_MARK_LOCAL_ERROR	Local error encountered
	RXRPC_SKB_MARK_NEW_CALL		New incoming call awaiting acceptance

 The remote abort message can be probed with rxrpc_kernel_get_abort_code().
 The two error messages can be probed with rxrpc_kernel_get_error_number().
 A new call can be accepted with rxrpc_kernel_accept_call().

 Data messages can have their contents extracted with the usual bunch of
 socket buffer manipulation functions.  A data message can be determined to
 be the last one in a sequence with rxrpc_kernel_is_data_last().  When a
 data message has been used up, rxrpc_kernel_data_delivered() should be
 called on it.

 Non-data messages should be handed to rxrpc_kernel_free_skb() to dispose
 of.  It is possible to get extra refs on all types of message for later
 freeing, but this may pin the state of a call until the message is finally
 freed.

 (*) Accept an incoming call.

struct rxrpc_call *
rxrpc_kernel_accept_call(struct socket *sock,
 unsigned long user_call_ID);

 This is used to accept an incoming call and to assign it a call ID.  This
 function is similar to rxrpc_kernel_begin_call() and calls accepted must
 be ended in the same way.

 If this function is successful, an opaque reference to the RxRPC call is
 returned.  The caller now holds a reference on this and it must be
 properly ended.

 (*) Reject an incoming call.

int rxrpc_kernel_reject_call(struct socket *sock);

 This is used to reject the first incoming call on the socket's queue with
 a BUSY message.  -ENODATA is returned if there were no incoming calls.
 Other errors may be returned if the call had been aborted (-ECONNABORTED)
 or had timed out (-ETIME).

 (*) Record the delivery of a data message and free it.

void rxrpc_kernel_data_delivered(struct sk_buff *skb);

 This is used to record a data message as having been delivered and to
 update the ACK state for the call.  The socket buffer will be freed.

 (*) Free a message.

void rxrpc_kernel_free_skb(struct sk_buff *skb);

 This is used to free a non-DATA socket buffer intercepted from an AF_RXRPC
 socket.

 (*) Determine if a data message is the last one on a call.

bool rxrpc_kernel_is_data_last(struct sk_buff *skb);

 This is used to determine if a socket buffer holds the last data message
 to be received for a call (true will be returned if it does, false
 if not).

 The data message will be part of the reply on a client call and the
 request on an incoming call.  In the latter case there will be more
 messages, but in the former case there will not.

 (*) Get the abort code from an abort message.

u32 rxrpc_kernel_get_abort_code(struct sk_buff *skb);

 This is used to extract the abort code from a remote abort message.

 (*) Get the error number from a local or network error message.

int rxrpc_kernel_get_error_number(struct sk_buff *skb);

 This is used to extract the error number from a message indicating either
 a local error occurred or a network error occurred.
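
As a reading aid for the skb->mark table and the helper functions listed
above, the following userspace sketch maps each mark to the helper the text
says to use for probing it and for disposing of it.  The mark and helper names
come from the description; the enum's numeric values, the type name
skb_mark_model and the two lookup functions are assumptions made for the
example.

```c
#include <assert.h>
#include <string.h>

/* Mark names as documented; the numeric values here are placeholders. */
enum skb_mark_model {
	RXRPC_SKB_MARK_DATA,
	RXRPC_SKB_MARK_FINAL_ACK,
	RXRPC_SKB_MARK_BUSY,
	RXRPC_SKB_MARK_REMOTE_ABORT,
	RXRPC_SKB_MARK_NET_ERROR,
	RXRPC_SKB_MARK_LOCAL_ERROR,
	RXRPC_SKB_MARK_NEW_CALL
};

/* Name of the helper used to examine a message of this type,
 * or "" if the text names none. */
const char *probe_helper_for(enum skb_mark_model mark)
{
	switch (mark) {
	case RXRPC_SKB_MARK_DATA:
		return "rxrpc_kernel_is_data_last";
	case RXRPC_SKB_MARK_REMOTE_ABORT:
		return "rxrpc_kernel_get_abort_code";
	case RXRPC_SKB_MARK_NET_ERROR:
	case RXRPC_SKB_MARK_LOCAL_ERROR:
		return "rxrpc_kernel_get_error_number";
	case RXRPC_SKB_MARK_NEW_CALL:
		return "rxrpc_kernel_accept_call";
	default:
		return "";
	}
}

/* Name of the helper that disposes of a message of this type. */
const char *disposal_helper_for(enum skb_mark_model mark)
{
	if (mark == RXRPC_SKB_MARK_DATA)
		return "rxrpc_kernel_data_delivered";
	return "rxrpc_kernel_free_skb";
}
```

An interceptor would presumably switch on skb->mark in much the same way,
calling the real functions rather than returning their names.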

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 Documentation/networking/rxrpc.txt |  196 
 include/net/af_rxrpc.h |   44 
 include/rxrpc/packet.h |   12 ++
 net/rxrpc/af_rxrpc.c   |  122 +-
 net/rxrpc/ar-accept.c  |  111 
 net/rxrpc/ar-connection.c  |   24 +++-
 net/rxrpc/ar-input.c

Re: [PATCH 1/9] AF_RXRPC: Add blkcipher accessors for using kernel data directly

2007-04-03 Thread David Howells
Herbert Xu [EMAIL PROTECTED] wrote:

 Would it be possible to just use the existing scatterlist interface
 for now? We can simplify it later when things settle down.

I'll apply the attached patch for now and drop the bypass patch.  It's a bit
messy, but it does let me use the sg-list interface.

Note that it does paste stack space into sg-list elements, which I think
should be okay, and it asks the compiler to get it appropriately aligned bits
of stack.
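
The reasoning behind the alignment can be checked in userspace.  This is a
sketch only: the 4096-byte page size, struct crypt_block and the function
names are assumptions for the example.  A 16-byte object aligned on a 16-byte
boundary cannot cross a page boundary, because the page size is a multiple of
16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for the example */

struct crypt_block {
	uint32_t x[4];		/* 16 bytes, like the tmpbuf in the patch */
};

/* True if [addr, addr + len) lies within a single page (len > 0). */
int fits_in_one_page(uintptr_t addr, size_t len)
{
	return addr / MODEL_PAGE_SIZE == (addr + len - 1) / MODEL_PAGE_SIZE;
}

/* Count the 16-byte-aligned start addresses in two pages at which a
 * 16-byte block would straddle a page boundary. */
int count_straddling_aligned_blocks(void)
{
	uintptr_t addr;
	int bad = 0;

	for (addr = 0; addr < 2 * MODEL_PAGE_SIZE; addr += 16)
		if (!fits_in_one_page(addr, sizeof(struct crypt_block)))
			bad++;
	return bad;
}

/* Ask the compiler for an aligned stack object, as the patch does. */
int stack_block_is_safe(void)
{
	struct crypt_block tmpbuf __attribute__((aligned(16)));

	tmpbuf.x[0] = 0;
	return ((uintptr_t)&tmpbuf % 16) == 0 &&
	       fits_in_one_page((uintptr_t)&tmpbuf, sizeof(tmpbuf));
}
```

Without the alignment, a block starting 8 bytes before a page boundary would
need two sg-list elements rather than one.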

David

diff --git a/net/rxrpc/rxkad.c b/net/rxrpc/rxkad.c
index b3bd399..1eaf529 100644
--- a/net/rxrpc/rxkad.c
+++ b/net/rxrpc/rxkad.c
@@ -111,8 +111,11 @@ static void rxkad_prime_packet_security(struct rxrpc_connection *conn)
 {
 	struct rxrpc_key_payload *payload;
 	struct blkcipher_desc desc;
+	struct scatterlist sg[2];
 	struct rxrpc_crypt iv;
-	__be32 tmpbuf[4];
+	struct {
+		__be32 x[4];
+	} tmpbuf __attribute__((aligned(16))); /* must all be in same page */
 
 	_enter("");
 
@@ -126,16 +129,18 @@ static void rxkad_prime_packet_security(struct rxrpc_connection *conn)
 	desc.info = iv.x;
 	desc.flags = 0;
 
-	tmpbuf[0] = conn->epoch;
-	tmpbuf[1] = conn->cid;
-	tmpbuf[2] = 0;
-	tmpbuf[3] = htonl(conn->security_ix);
+	tmpbuf.x[0] = conn->epoch;
+	tmpbuf.x[1] = conn->cid;
+	tmpbuf.x[2] = 0;
+	tmpbuf.x[3] = htonl(conn->security_ix);
 
-	crypto_blkcipher_encrypt_kernel_iv(&desc, (void *) tmpbuf,
-					   (void *) tmpbuf, sizeof(tmpbuf));
+	memset(sg, 0, sizeof(sg));
+	sg_set_buf(&sg[0], &tmpbuf, sizeof(tmpbuf));
+	sg_set_buf(&sg[1], &tmpbuf, sizeof(tmpbuf));
+	crypto_blkcipher_encrypt_iv(&desc, &sg[0], &sg[1], sizeof(tmpbuf));
 
-	memcpy(&conn->csum_iv, &tmpbuf[2], sizeof(conn->csum_iv));
-	ASSERTCMP(conn->csum_iv.n[0], ==, tmpbuf[2]);
+	memcpy(&conn->csum_iv, &tmpbuf.x[2], sizeof(conn->csum_iv));
+	ASSERTCMP(conn->csum_iv.n[0], ==, tmpbuf.x[2]);
 
 	_leave("");
 }
@@ -151,10 +156,11 @@ static int rxkad_secure_packet_auth(const struct rxrpc_call *call,
 	struct rxrpc_skb_priv *sp;
 	struct blkcipher_desc desc;
 	struct rxrpc_crypt iv;
+	struct scatterlist sg[2];
 	struct {
 		struct rxkad_level1_hdr hdr;
 		__be32	first;	/* first four bytes of data and padding */
-	} buf;
+	} tmpbuf __attribute__((aligned(8))); /* must all be in same page */
 	u16 check;
 
 	sp = rxrpc_skb(skb);
@@ -164,8 +170,8 @@ static int rxkad_secure_packet_auth(const struct rxrpc_call *call,
 	check = ntohl(sp->hdr.seq ^ sp->hdr.callNumber);
 	data_size |= (u32) check << 16;
 
-	buf.hdr.data_size = htonl(data_size);
-	memcpy(&buf.first, sechdr + 4, sizeof(buf.first));
+	tmpbuf.hdr.data_size = htonl(data_size);
+	memcpy(&tmpbuf.first, sechdr + 4, sizeof(tmpbuf.first));
 
 	/* start the encryption afresh */
 	memset(&iv, 0, sizeof(iv));
@@ -173,10 +179,12 @@ static int rxkad_secure_packet_auth(const struct rxrpc_call *call,
 	desc.info = iv.x;
 	desc.flags = 0;
 
-	crypto_blkcipher_encrypt_kernel_iv(&desc, (void *) &buf,
-					   (void *) &buf, sizeof(buf));
+	memset(sg, 0, sizeof(sg));
+	sg_set_buf(&sg[0], &tmpbuf, sizeof(tmpbuf));
+	sg_set_buf(&sg[1], &tmpbuf, sizeof(tmpbuf));
+	crypto_blkcipher_encrypt_iv(&desc, &sg[0], &sg[1], sizeof(tmpbuf));
 
-	memcpy(sechdr, &buf, sizeof(buf));
+	memcpy(sechdr, &tmpbuf, sizeof(tmpbuf));
 
 	_leave(" = 0");
 	return 0;
@@ -191,7 +199,8 @@ static int rxkad_secure_packet_encrypt(const struct rxrpc_call *call,
 					void *sechdr)
 {
 	const struct rxrpc_key_payload *payload;
-	struct rxkad_level2_hdr rxkhdr;
+	struct rxkad_level2_hdr rxkhdr
+		__attribute__((aligned(8))); /* must be all on one page */
 	struct rxrpc_skb_priv *sp;
 	struct blkcipher_desc desc;
 	struct rxrpc_crypt iv;
@@ -217,8 +226,10 @@ static int rxkad_secure_packet_encrypt(const struct rxrpc_call *call,
 	desc.info = iv.x;
 	desc.flags = 0;
 
-	crypto_blkcipher_encrypt_kernel_iv(&desc, (void *) sechdr,
-					   (void *) &rxkhdr, sizeof(rxkhdr));
+	memset(sg, 0, sizeof(sg[0]) * 2);
+	sg_set_buf(&sg[0], sechdr, sizeof(rxkhdr));
+	sg_set_buf(&sg[1], &rxkhdr, sizeof(rxkhdr));
+	crypto_blkcipher_encrypt_iv(&desc, &sg[0], &sg[1], sizeof(rxkhdr));
 
 	/* we want to encrypt the skbuff in-place */
 	nsg = skb_cow_data(skb, 0, &trailer);
@@ -246,7 +257,11 @@ static int rxkad_secure_packet(const struct rxrpc_call *call,
struct rxrpc_skb_priv *sp;
struct blkcipher_desc desc;
struct rxrpc_crypt iv;
-   __be32 tmpbuf[2], x;
+   struct scatterlist sg[2];
+   struct {
+   __be32 x[2];
+

[PATCH 1/8] AF_RXRPC: Move generic skbuff stuff from XFRM code to generic code [try #2]

2007-04-03 Thread David Howells
Move generic skbuff stuff from XFRM code to generic code so that AF_RXRPC can
use it too.

The kdoc comments I've attached to the functions need to be checked by whoever
wrote those functions, as I had to make some guesses about how they work.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 include/linux/skbuff.h |6 ++
 include/net/esp.h  |2 -
 net/core/skbuff.c  |  188 
 net/xfrm/xfrm_algo.c   |  169 ---
 4 files changed, 194 insertions(+), 171 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 82f43ad..d53ff7c 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -83,6 +83,7 @@
  */
 
 struct net_device;
+struct scatterlist;
 
 #ifdef CONFIG_NETFILTER
 struct nf_conntrack {
@@ -364,6 +365,11 @@ extern struct sk_buff *skb_realloc_headroom(struct sk_buff *skb,
 extern struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
 				       int newheadroom, int newtailroom,
 				       gfp_t priority);
+extern int	       skb_to_sgvec(struct sk_buff *skb,
+				    struct scatterlist *sg, int offset,
+				    int len);
+extern int	       skb_cow_data(struct sk_buff *skb, int tailbits,
+				    struct sk_buff **trailer);
 extern int	       skb_pad(struct sk_buff *skb, int pad);
 #define dev_kfree_skb(a)	kfree_skb(a)
 extern void	      skb_over_panic(struct sk_buff *skb, int len,
diff --git a/include/net/esp.h b/include/net/esp.h
index 713d039..d05d8d2 100644
--- a/include/net/esp.h
+++ b/include/net/esp.h
@@ -40,8 +40,6 @@ struct esp_data
} auth;
 };
 
-extern int skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len);
-extern int skb_cow_data(struct sk_buff *skb, int tailbits, struct sk_buff **trailer);
 extern void *pskb_put(struct sk_buff *skb, struct sk_buff *tail, int len);
 
 static inline int esp_mac_digest(struct esp_data *esp, struct sk_buff *skb,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 87573ae..156b9c0 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -55,6 +55,7 @@
 #include <linux/cache.h>
 #include <linux/rtnetlink.h>
 #include <linux/init.h>
+#include <linux/scatterlist.h>
 
 #include <net/protocol.h>
 #include <net/dst.h>
@@ -2060,6 +2061,190 @@ void __init skb_init(void)
NULL, NULL);
 }
 
+/**
+ * skb_to_sgvec - Fill a scatter-gather list from a socket buffer
+ * @skb: Socket buffer containing the buffers to be mapped
+ * @sg: The scatter-gather list to map into
+ * @offset: The offset into the buffer's contents to start mapping
+ * @len: Length of buffer space to be mapped
+ *
+ * Fill the specified scatter-gather list with mappings/pointers into a
+ * region of the buffer space attached to a socket buffer.
+ */
+int
+skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
+{
+	int start = skb_headlen(skb);
+	int i, copy = start - offset;
+	int elt = 0;
+
+	if (copy > 0) {
+		if (copy > len)
+			copy = len;
+		sg[elt].page = virt_to_page(skb->data + offset);
+		sg[elt].offset = (unsigned long)(skb->data + offset) % PAGE_SIZE;
+		sg[elt].length = copy;
+		elt++;
+		if ((len -= copy) == 0)
+			return elt;
+		offset += copy;
+	}
+
+	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+		int end;
+
+		BUG_TRAP(start <= offset + len);
+
+		end = start + skb_shinfo(skb)->frags[i].size;
+		if ((copy = end - offset) > 0) {
+			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+
+			if (copy > len)
+				copy = len;
+			sg[elt].page = frag->page;
+			sg[elt].offset = frag->page_offset+offset-start;
+			sg[elt].length = copy;
+			elt++;
+			if (!(len -= copy))
+				return elt;
+			offset += copy;
+		}
+		start = end;
+	}
+
+	if (skb_shinfo(skb)->frag_list) {
+		struct sk_buff *list = skb_shinfo(skb)->frag_list;
+
+		for (; list; list = list->next) {
+			int end;
+
+			BUG_TRAP(start <= offset + len);
+
+			end = start + list->len;
+			if ((copy = end - offset) > 0) {
+				if (copy > len)
+					copy = len;
+				elt += skb_to_sgvec(list, sg+elt, offset - start, copy);
+				if ((len -= copy) == 0

[PATCH 2/8] AF_RXRPC: Make it possible to merely try to cancel timers and delayed work [try #2]

2007-04-03 Thread David Howells
Export try_to_del_timer_sync() for use by the RxRPC module.

Add a try_to_cancel_delayed_work() so that it is possible to merely attempt to
cancel a delayed work timer.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 include/linux/workqueue.h |   21 +
 kernel/timer.c|2 ++
 2 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 2a7b38d..40a61ae 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -204,4 +204,25 @@ static inline int cancel_delayed_work(struct delayed_work *work)
return ret;
 }
 
+/**
+ * try_to_cancel_delayed_work - Try to kill pending scheduled, delayed work
+ * @work: the work to cancel
+ *
+ * Try to kill off a pending schedule_delayed_work().
+ * - The timer may still be running afterwards, and if so, the work may still
+ *   be pending
+ * - Returns -1 if timer still active, 1 if timer removed, 0 if not scheduled
+ * - Can be called from the work routine; if it's still pending, just return
+ *   and it'll be called again.
+ */
+static inline int try_to_cancel_delayed_work(struct delayed_work *work)
+{
+	int ret;
+
+	ret = try_to_del_timer_sync(&work->timer);
+	if (ret > 0)
+		work_release(&work->work);
+	return ret;
+}
+
 #endif
diff --git a/kernel/timer.c b/kernel/timer.c
index 440048a..ba4d6e0 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -505,6 +505,8 @@ out:
return ret;
 }
 
+EXPORT_SYMBOL(try_to_del_timer_sync);
+
 /**
  * del_timer_sync - deactivate a timer and wait for the handler to finish.
  * @timer: the timer to be deactivated



[PATCH 0/8] AF_RXRPC socket family and AFS rewrite [try #2]

2007-04-03 Thread David Howells

The first four of these patches together provide secure client-side RxRPC
connectivity as a Linux kernel socket family.  Only the RxRPC transport/session
side is supplied - the presentation side (marshalling the data) is left to the
client.  Copies of the patches can be found here:

http://people.redhat.com/~dhowells/rxrpc/series
http://people.redhat.com/~dhowells/rxrpc/01-move-skb-generic.diff
http://people.redhat.com/~dhowells/rxrpc/02-timers.diff
http://people.redhat.com/~dhowells/rxrpc/03-keys.diff
http://people.redhat.com/~dhowells/rxrpc/04-af_rxrpc.diff

Further patches make the in-kernel AFS filesystem use AF_RXRPC and delete the
old RxRPC implementation:

http://people.redhat.com/~dhowells/rxrpc/05-afs-cleanup.diff
http://people.redhat.com/~dhowells/rxrpc/06-af_rxrpc-kernel.diff
http://people.redhat.com/~dhowells/rxrpc/07-af_rxrpc-afs.diff
http://people.redhat.com/~dhowells/rxrpc/08-af_rxrpc-delete-old.diff

The userspace access methods make use of the control data passed to/by
sendmsg() and recvmsg().  See the three simple test programs:

http://people.redhat.com/~dhowells/rxrpc/klog.c
http://people.redhat.com/~dhowells/rxrpc/rxrpc.c
http://people.redhat.com/~dhowells/rxrpc/listen.c

TODO:

 (*) Make certain parameters (such as connection timeouts) userspace
 configurable.

 (*) Make userspace utilities use it; librxrpc.

 (*) Userspace documentation.

 (*) KerberosV security.

Changes:

 (*) SOCK_RPC has been removed.  SOCK_DGRAM is now used instead.

 (*) I've added a facility whereby calls can be made to destinations other than
 the connect() address of a client socket by making use of msg_name in the
 msghdr struct when using sendmsg() to send the first data packet of a
 call.  Indeed, a client socket need not be connected before being used
 so.

 (*) I've also added a facility whereby client calls may also be made on
 server sockets, again by using msg_name in the msghdr struct.  In such a
 case, the server's local transport endpoint is used.

 (*) I've made the write buffer space check available to various callers
 (sk_write_space) and implemented poll support.

 (*) Rewrote rxrpc_recvmsg().  It now concatenates adjacent data messages from
 the same call when delivering them.

 (*) Updated the documentation to include notes on recvmsg, cover control
 messages and cover SOL_RXRPC-level socket options.

 (*) Provided an in-kernel interface to give in-kernel utilities easier access
 to the facility.

 (*) Made fs/afs/ use it.

 (*) Deleted the old contents of net/rxrpc/.

 (*) Use the scatterlist interface to the crypto API for now.  The patch that
 added the direct access interface conflicts with patches Herbert Xu is
 producing, so I've dropped it for the moment.

 (*) Moved a bug fix to make secure connection reuse from the af_rxrpc-kernel
 patch to the af_rxrpc main patch.

David


[PATCH 6/8] AF_RXRPC: Add an interface to the AF_RXRPC module for the AFS filesystem to use [try #2]

2007-04-03 Thread David Howells
,
unsigned long user_call_ID,
struct sk_buff *skb);

void
rxrpc_kernel_intercept_rx_messages(struct socket *sock,
   rxrpc_interceptor_t interceptor);

 This installs an interceptor function on the specified AF_RXRPC socket.
 All messages that would otherwise wind up in the socket's Rx queue are
 then diverted to this function.  Note that care must be taken to process
 the messages in the right order to maintain DATA message sequentiality.

 The interceptor function itself is provided with the address of the socket
 handling the incoming message, the ID assigned by the kernel utility to the
 call and the socket buffer containing the message.

 The skb->mark field indicates the type of message:

    MARK                            MEANING
    =============================== =======================================
    RXRPC_SKB_MARK_DATA             Data message
    RXRPC_SKB_MARK_FINAL_ACK        Final ACK received for an incoming call
    RXRPC_SKB_MARK_BUSY             Client call rejected as server busy
    RXRPC_SKB_MARK_REMOTE_ABORT     Call aborted by peer
    RXRPC_SKB_MARK_NET_ERROR        Network error detected
    RXRPC_SKB_MARK_LOCAL_ERROR      Local error encountered
    RXRPC_SKB_MARK_NEW_CALL         New incoming call awaiting acceptance

 The remote abort message can be probed with rxrpc_kernel_get_abort_code().
 The two error messages can be probed with rxrpc_kernel_get_error_number().
 A new call can be accepted with rxrpc_kernel_accept_call().

 Data messages can have their contents extracted with the usual bunch of
 socket buffer manipulation functions.  A data message can be determined to
 be the last one in a sequence with rxrpc_kernel_is_data_last().  When a
 data message has been used up, rxrpc_kernel_data_delivered() should be
 called on it.

 Non-data messages should be handed to rxrpc_kernel_free_skb() to dispose
 of.  It is possible to get extra refs on all types of message for later
 freeing, but this may pin the state of a call until the message is finally
 freed.
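
 The per-type handling described above can be summarised as a dispatch table.
 The sketch below is a userspace model only: the model_* names and the enum
 values are hypothetical, and the comments name the kernel functions that an
 interceptor would call at each point.

```c
#include <assert.h>

/* Message types from the table above; the numeric values are assumptions. */
enum model_skb_mark {
	MODEL_MARK_DATA,
	MODEL_MARK_FINAL_ACK,
	MODEL_MARK_BUSY,
	MODEL_MARK_REMOTE_ABORT,
	MODEL_MARK_NET_ERROR,
	MODEL_MARK_LOCAL_ERROR,
	MODEL_MARK_NEW_CALL,
};

/* What an interceptor would do with a message of each type. */
enum model_action {
	MODEL_CONSUME_DATA,	/* read data, then rxrpc_kernel_data_delivered() */
	MODEL_READ_ABORT_CODE,	/* rxrpc_kernel_get_abort_code(), then free */
	MODEL_READ_ERRNO,	/* rxrpc_kernel_get_error_number(), then free */
	MODEL_ACCEPT_CALL,	/* accept or reject the call, then free */
	MODEL_JUST_FREE,	/* rxrpc_kernel_free_skb() */
};

static enum model_action model_classify(enum model_skb_mark mark)
{
	switch (mark) {
	case MODEL_MARK_DATA:
		return MODEL_CONSUME_DATA;
	case MODEL_MARK_REMOTE_ABORT:
		return MODEL_READ_ABORT_CODE;
	case MODEL_MARK_NET_ERROR:
	case MODEL_MARK_LOCAL_ERROR:
		return MODEL_READ_ERRNO;
	case MODEL_MARK_NEW_CALL:
		return MODEL_ACCEPT_CALL;
	case MODEL_MARK_FINAL_ACK:
	case MODEL_MARK_BUSY:
		return MODEL_JUST_FREE;
	}
	assert(0 && "unknown mark");
	return MODEL_JUST_FREE;
}
```

 Only data messages go through rxrpc_kernel_data_delivered(); every other
 type ends with rxrpc_kernel_free_skb().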

 (*) Accept an incoming call.

struct rxrpc_call *
rxrpc_kernel_accept_call(struct socket *sock,
 unsigned long user_call_ID);

 This is used to accept an incoming call and to assign it a call ID.  This
 function is similar to rxrpc_kernel_begin_call() and calls accepted must
 be ended in the same way.

 If this function is successful, an opaque reference to the RxRPC call is
 returned.  The caller now holds a reference on this and it must be
 properly ended.

 (*) Reject an incoming call.

int rxrpc_kernel_reject_call(struct socket *sock);

 This is used to reject the first incoming call on the socket's queue with
 a BUSY message.  -ENODATA is returned if there were no incoming calls.
 Other errors may be returned if the call had been aborted (-ECONNABORTED)
 or had timed out (-ETIME).

 (*) Record the delivery of a data message and free it.

void rxrpc_kernel_data_delivered(struct sk_buff *skb);

 This is used to record a data message as having been delivered and to
 update the ACK state for the call.  The socket buffer will be freed.

 (*) Free a message.

void rxrpc_kernel_free_skb(struct sk_buff *skb);

 This is used to free a non-DATA socket buffer intercepted from an AF_RXRPC
 socket.

 (*) Determine if a data message is the last one on a call.

bool rxrpc_kernel_is_data_last(struct sk_buff *skb);

 This is used to determine if a socket buffer holds the last data message
 to be received for a call (true will be returned if it does, false
 if not).

 The data message will be part of the reply on a client call and the
 request on an incoming call.  In the latter case there will be more
 messages, but in the former case there will not.

 (*) Get the abort code from an abort message.

u32 rxrpc_kernel_get_abort_code(struct sk_buff *skb);

 This is used to extract the abort code from a remote abort message.

 (*) Get the error number from a local or network error message.

int rxrpc_kernel_get_error_number(struct sk_buff *skb);

 This is used to extract the error number from a message indicating either
 a local error occurred or a network error occurred.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 Documentation/networking/rxrpc.txt |  196 
 include/net/af_rxrpc.h |   44 
 include/rxrpc/packet.h |   12 ++
 net/rxrpc/af_rxrpc.c   |  122 +-
 net/rxrpc/ar-accept.c  |  111 
 net/rxrpc/ar-input.c   |   36 ---
 net/rxrpc/ar-internal.h

[PATCH 3/8] AF_RXRPC: Key facility changes for AF_RXRPC [try #2]

2007-04-03 Thread David Howells
Export the keyring key type definition and document its availability.

Add alternative types into the key's type_data union to make it more useful.
Not all users necessarily want to use it as a list_head (AF_RXRPC doesn't, for
example), so make it clear that it can be used in other ways.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 Documentation/keys.txt  |   12 ++++++++++++
 include/linux/key.h     |    2 ++
 security/keys/keyring.c |    2 ++
 3 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/Documentation/keys.txt b/Documentation/keys.txt
index 60c665d..81d9aa0 100644
--- a/Documentation/keys.txt
+++ b/Documentation/keys.txt
@@ -859,6 +859,18 @@ payload contents for more information.
void unregister_key_type(struct key_type *type);
 
 
+Under some circumstances, it may be desirable to deal with a
+bundle of keys.  The facility provides access to the keyring type for managing
+such a bundle:
+
+   struct key_type key_type_keyring;
+
+This can be used with a function such as request_key() to find a specific
+keyring in a process's keyrings.  A keyring thus found can then be searched
+with keyring_search().  Note that it is not possible to use request_key() to
+search a specific keyring, so using keyrings in this way is of limited utility.
+
+
 ===
 NOTES ON ACCESSING PAYLOAD CONTENTS
 ===
diff --git a/include/linux/key.h b/include/linux/key.h
index 169f05e..a9220e7 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -160,6 +160,8 @@ struct key {
 */
union {
 		struct list_head	link;
+		unsigned long		x[2];
+		void			*p[2];
} type_data;
 
/* key data
diff --git a/security/keys/keyring.c b/security/keys/keyring.c
index ad45ce7..88292e3 100644
--- a/security/keys/keyring.c
+++ b/security/keys/keyring.c
@@ -66,6 +66,8 @@ struct key_type key_type_keyring = {
.read   = keyring_read,
 };
 
+EXPORT_SYMBOL(key_type_keyring);
+
 /*
  * semaphore to serialise link/link calls to prevent two link calls in parallel
  * introducing a cycle



Re: [PATCH] CONFIG_PACKET_MMAP should depend on MMU

2007-04-10 Thread David Howells
Robin Getz [EMAIL PROTECTED] wrote:

> David - I know you have been reworking the noMMU vma handling - is there a
> solution to vm_insert_page?

The reason vm_insert_page() is being called, I imagine, is because
packet_mmap() has to insert mappings to an already existing buffer.  All it
does is munge the PTEs in that virtual region to point to the buffer.

As long as the buffer is completely contiguous (which I don't know for
certain), then this function can be trivially reduced in NOMMU-mode to
something that just returns the address of the requested part of the buffer.
No remapping would be required.

However...  If the buffer is *not* completely contiguous, then you can still
perform mmaps of it - but only where the desired part _is_ contiguous.
Alternatively, you can arrange for the buffer to be completely contiguous
upfront.

Looking at alloc_pg_vec() in af_packet.c, I will place my bets on the latter
case.  I don't know that this is a problem; it depends on how things work, and
that I don't know offhand.  If someone can give me a simple test program, I
would be able to evaluate it better.
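
For the fully contiguous case, the reduction amounts to offset arithmetic with
a bounds check.  This is a userspace sketch under that assumption, not the
af_packet code; model_ring and model_nommu_map() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a completely contiguous packet buffer. */
struct model_ring {
	unsigned char *base;	/* start of the buffer */
	size_t len;		/* total size in bytes */
};

/*
 * Return the address backing [offset, offset + len), or NULL if the
 * request does not fit inside the buffer.  No remapping takes place.
 */
static void *model_nommu_map(const struct model_ring *ring,
			     size_t offset, size_t len)
{
	assert(ring != NULL && ring->base != NULL);

	if (len == 0 || offset > ring->len || len > ring->len - offset)
		return NULL;
	return ring->base + offset;
}
```

If the buffer were made of separately allocated blocks, the same check would
have to be applied per block, and a request spanning two blocks would be
refused.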

David


[PATCH 0/8] AFS: Add security support and fix bugs

2007-04-11 Thread David Howells

These patches build on the patchset labelled "AF_RXRPC socket family and AFS
rewrite".  The patches are also available for HTTP download.

Firstly, the patches fix a number of bugs in AF_RXRPC:

http://people.redhat.com/~dhowells/rxrpc/09-af_rxrpc-own-workqueues.diff
http://people.redhat.com/~dhowells/rxrpc/10-af_rxrpc-fixes.diff

Secondly, they fix some bugs in the AFS filesystem:

http://people.redhat.com/~dhowells/rxrpc/11-afs-callback-wq.diff
http://people.redhat.com/~dhowells/rxrpc/12-afs-vlocation.diff
http://people.redhat.com/~dhowells/rxrpc/13-afs-multimount.diff

And finally, they add security support to AFS:

http://people.redhat.com/~dhowells/rxrpc/14-afs-rxrpc-key.diff
http://people.redhat.com/~dhowells/rxrpc/15-afs-nameidata-key.diff
http://people.redhat.com/~dhowells/rxrpc/16-afs-security.diff


A security key is acquired by running the klog program:

http://people.redhat.com/~dhowells/rxrpc/klog.c

This is compiled by:

make klog CFLAGS="-Wall -g" LDLIBS="-lcrypto -lcrypt -lkrb4 -lkeyutils"

And then run by:

./klog

Note that at the moment this is a rough and ready test program that has the
username, realm, password and proposed key timeout compiled in.  Note also that
it will only talk to the AFS kaserver.


If a security key is acquired, then all subsequent operations - including VL
lookups and mounts - performed with that session keyring will be authenticated
using that key.  The key can be viewed like so:

[EMAIL PROTECTED] ~]# keyctl show
Session Keyring
   -3 --alswrv  0 0  keyring: _ses.3268
2 --alswrv  0 0   \_ keyring: _uid.0
111416553 --als--v  0 0   \_ rxrpc: [EMAIL PROTECTED]

David


[PATCH 3/8] AFS: Fix callback aggregator work item deadlock

2007-04-11 Thread David Howells
Fix a deadlock in the give-up-callback aggregator dispatcher work item whereby
the aggregator runs on keventd as does timed autounmount, thus leading to the
unmount blocking keventd whilst waiting for keventd to run the aggregator when
the give-up-callback buffer is full.
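
The aggregator's batching decision, as it appears in the
afs_do_give_up_callback() hunk of this patch, can be modelled in userspace.
This is a sketch only: MODEL_CBMAX is a hypothetical limit standing in for
AFSCBMAX, and taking no action once the limit has been passed is an
assumption of the model.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_CBMAX 4	/* hypothetical batch limit; stands in for AFSCBMAX */

enum model_dispatch {
	MODEL_DISPATCH_DELAYED,	/* queue the dispatcher to run after a delay */
	MODEL_DISPATCH_NOW,	/* batch is full: flush immediately */
	MODEL_DISPATCH_NONE,	/* past the limit: nothing more to schedule */
};

/* Record one more callback to give up and decide how to dispatch. */
static enum model_dispatch model_note_callback(unsigned int *pending)
{
	unsigned int n;

	assert(pending != NULL);

	n = ++*pending;		/* models atomic_inc_return() */
	if (n < MODEL_CBMAX)
		return MODEL_DISPATCH_DELAYED;
	if (n == MODEL_CBMAX)
		return MODEL_DISPATCH_NOW;
	return MODEL_DISPATCH_NONE;
}
```

Whichever branch is taken, the patch queues the work on the dedicated
afs_callback_update_worker rather than on keventd, so a task blocked on a
full buffer no longer waits for the thread it is itself blocking.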

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/afs/callback.c |   14 +-
 fs/afs/fsclient.c |6 --
 2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/fs/afs/callback.c b/fs/afs/callback.c
index fdad11c..1533b49 100644
--- a/fs/afs/callback.c
+++ b/fs/afs/callback.c
@@ -232,7 +232,8 @@ static void afs_do_give_up_callback(struct afs_server *server,
 	 * possible to ship in one operation */
 	switch (atomic_inc_return(&server->cb_break_n)) {
 	case 1 ... AFSCBMAX - 1:
-		schedule_delayed_work(&server->cb_break_work, HZ * 2);
+		queue_delayed_work(afs_callback_update_worker,
+				   &server->cb_break_work, HZ * 2);
 		break;
 	case AFSCBMAX:
 		afs_flush_callback_breaks(server);
@@ -271,9 +272,11 @@ void afs_give_up_callback(struct afs_vnode *vnode)
 	spin_lock(&server->cb_lock);
 	if (vnode->cb_promised && afs_breakring_space(server) == 0) {
 		add_wait_queue(&server->cb_break_waitq, &myself);
-		while (vnode->cb_promised &&
-		       afs_breakring_space(server) == 0) {
+		for (;;) {
 			set_current_state(TASK_UNINTERRUPTIBLE);
+			if (!vnode->cb_promised ||
+			    afs_breakring_space(server) != 0)
+				break;
 			spin_unlock(&server->cb_lock);
 			schedule();
 			spin_lock(&server->cb_lock);
@@ -315,7 +318,8 @@ void afs_dispatch_give_up_callbacks(struct work_struct *work)
 void afs_flush_callback_breaks(struct afs_server *server)
 {
 	if (try_to_cancel_delayed_work(&server->cb_break_work) >= 0)
-		schedule_delayed_work(&server->cb_break_work, 0);
+		queue_delayed_work(afs_callback_update_worker,
+				   &server->cb_break_work, 0);
 }
 
 #if 0
@@ -426,7 +430,7 @@ static void afs_callback_updater(struct work_struct *work)
 int __init afs_callback_update_init(void)
 {
 	afs_callback_update_worker =
-		create_singlethread_workqueue("kafs_cbupdated");
+		create_singlethread_workqueue("kafs_callbackd");
 	return afs_callback_update_worker ? 0 : -ENOMEM;
 }
 
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index d955178..e2a36f8 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -355,10 +355,11 @@ int afs_fs_give_up_callbacks(struct afs_server *server,
__be32 *bp, *tp;
int loop;
 
-	_enter("");
-
 	ncallbacks = CIRC_CNT(server->cb_break_head, server->cb_break_tail,
 			      ARRAY_SIZE(server->cb_break));
+
+	_enter("{%zu},", ncallbacks);
+
 	if (ncallbacks == 0)
 		return 0;
 	if (ncallbacks > AFSCBMAX)
@@ -398,6 +399,7 @@ int afs_fs_give_up_callbacks(struct afs_server *server,
 			(ARRAY_SIZE(server->cb_break) - 1);
 	}
 
+	ASSERT(ncallbacks > 0);
 	wake_up_nr(&server->cb_break_waitq, ncallbacks);
 
 	return afs_make_call(&server->addr, call, GFP_NOFS, wait_mode);



[PATCH 1/8] AF_RXRPC: Use own workqueues

2007-04-11 Thread David Howells
Make the AF_RXRPC module use its own workqueues with their own per-CPU threads.
Currently it uses keventd to do the following tasks, amongst others:

 (*) Security negotiation

 (*) Packet encryption and decryption

 (*) Packet resending

 (*) ACK, abort and busy packet generation

 (*) Timeout handling

 (*) Missing packet catchup

 (*) Parts of incoming call management

 (*) Destruction of structures we've finished with

Some of these conflict with AFS's use of keventd, however, and can lead to
effective deadlock of resources.  Having discussed this, it has been suggested
that encryption and decryption shouldn't be done in keventd (that's not
unreasonable - it is potentially quite slow), and so the AF_RXRPC service is
given its own threads rather than AFS.

It might be useful to consider using the rpciod threads for this, if they were
separated out from the SunRPC module.
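
The module init path in this patch slots workqueue creation between two
existing setup steps and unwinds in reverse order on failure.  The userspace
sketch below models only that ordering; model_state, model_init() and the
fail_at parameter are hypothetical and exist purely to simulate a failing
step.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of which resources are currently set up. */
struct model_state {
	bool call_jar;		/* models the kmem cache */
	bool workqueue;		/* models rxrpc_workqueue */
	bool proto;		/* models protocol registration */
};

/* fail_at selects the step (1..3) that fails; 0 means all succeed. */
static int model_init(struct model_state *st, int fail_at)
{
	int ret = -ENOMEM;

	assert(st != NULL);

	if (fail_at == 1)
		goto error_call_jar;
	st->call_jar = true;

	if (fail_at == 2)
		goto error_work_queue;
	st->workqueue = true;

	if (fail_at == 3)
		goto error_proto;
	st->proto = true;
	return 0;

error_proto:
	st->workqueue = false;	/* models destroy_workqueue() */
error_work_queue:
	st->call_jar = false;	/* models kmem_cache_destroy() */
error_call_jar:
	return ret;
}
```

Each label undoes exactly the steps that completed before the failure, which
is why the new error_work_queue label sits between the protocol and call jar
cleanup.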

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 net/rxrpc/af_rxrpc.c  |   17 ++---
 net/rxrpc/ar-accept.c |   12 ++--
 net/rxrpc/ar-ack.c|   10 +-
 net/rxrpc/ar-call.c   |   16 
 net/rxrpc/ar-connection.c |8 
 net/rxrpc/ar-connevent.c  |   20 ++--
 net/rxrpc/ar-error.c  |6 +++---
 net/rxrpc/ar-input.c  |   24 
 net/rxrpc/ar-internal.h   |   28 
 net/rxrpc/ar-local.c  |2 +-
 net/rxrpc/ar-output.c |4 ++--
 net/rxrpc/ar-peer.c   |2 +-
 net/rxrpc/ar-recvmsg.c|2 +-
 net/rxrpc/ar-skbuff.c |2 +-
 net/rxrpc/ar-transport.c  |8 
 15 files changed, 92 insertions(+), 69 deletions(-)

diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index fb35998..115ad19 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -41,6 +41,8 @@ atomic_t rxrpc_debug_id;
 /* count of skbs currently in use */
 atomic_t rxrpc_n_skbs;
 
+struct workqueue_struct *rxrpc_workqueue;
+
 static void rxrpc_sock_destructor(struct sock *);
 
 /*
@@ -688,7 +690,7 @@ static int rxrpc_release_sock(struct sock *sk)
 
 	/* try to flush out this socket */
 	rxrpc_release_calls_on_socket(rx);
-	flush_scheduled_work();
+	flush_workqueue(rxrpc_workqueue);
 	rxrpc_purge_queue(&sk->sk_receive_queue);
 
 	if (rx->conn) {
@@ -785,15 +787,21 @@ static int __init af_rxrpc_init(void)
 
 	rxrpc_epoch = htonl(xtime.tv_sec);
 
+	ret = -ENOMEM;
 	rxrpc_call_jar = kmem_cache_create(
 		"rxrpc_call_jar", sizeof(struct rxrpc_call), 0,
 		SLAB_HWCACHE_ALIGN, NULL, NULL);
 	if (!rxrpc_call_jar) {
 		printk(KERN_NOTICE "RxRPC: Failed to allocate call jar\n");
-		ret = -ENOMEM;
 		goto error_call_jar;
 	}
 
+	rxrpc_workqueue = create_workqueue("krxrpcd");
+	if (!rxrpc_workqueue) {
+		printk(KERN_NOTICE "RxRPC: Failed to allocate work queue\n");
+		goto error_work_queue;
+	}
+
 	ret = proto_register(&rxrpc_proto, 1);
 	if (ret < 0) {
 		printk(KERN_CRIT "RxRPC: Cannot register protocol\n");
@@ -831,6 +839,8 @@ error_key_type:
 error_sock:
 	proto_unregister(&rxrpc_proto);
 error_proto:
+	destroy_workqueue(rxrpc_workqueue);
+error_work_queue:
 	kmem_cache_destroy(rxrpc_call_jar);
 error_call_jar:
 	return ret;
@@ -855,9 +865,10 @@ static void __exit af_rxrpc_exit(void)
 	ASSERTCMP(atomic_read(&rxrpc_n_skbs), ==, 0);
 
 	_debug("flush scheduled work");
-	flush_scheduled_work();
+	flush_workqueue(rxrpc_workqueue);
 	proc_net_remove("rxrpc_conns");
 	proc_net_remove("rxrpc_calls");
+	destroy_workqueue(rxrpc_workqueue);
 	kmem_cache_destroy(rxrpc_call_jar);
 	_leave("");
 }
diff --git a/net/rxrpc/ar-accept.c b/net/rxrpc/ar-accept.c
index 405092d..73243ab 100644
--- a/net/rxrpc/ar-accept.c
+++ b/net/rxrpc/ar-accept.c
@@ -139,7 +139,7 @@ static int rxrpc_accept_incoming_call(struct rxrpc_local *local,
 		call->conn->state = RXRPC_CONN_SERVER_CHALLENGING;
 		atomic_inc(&call->conn->usage);
 		set_bit(RXRPC_CONN_CHALLENGE, &call->conn->events);
-		schedule_work(&call->conn->processor);
+		rxrpc_queue_conn(call->conn);
 	} else {
 		_debug("conn ready");
 		call->state = RXRPC_CALL_SERVER_ACCEPTING;
@@ -183,7 +183,7 @@ invalid_service:
 	if (!test_bit(RXRPC_CALL_RELEASE, &call->flags) &&
 	    !test_and_set_bit(RXRPC_CALL_RELEASE, &call->events)) {
 		rxrpc_get_call(call);
-		schedule_work(&call->processor);
+		rxrpc_queue_call(call);
 	}
 	read_unlock_bh(&call->state_lock);
 	rxrpc_put_call(call);
@@ -375,7 +375,7 @@ struct rxrpc_call *rxrpc_accept_call(struct rxrpc_sock *rx,
BUG();
if (test_and_set_bit

[PATCH 2/8] AF_RXRPC: Lower dead call timeout and fix available call counting on connections

2007-04-11 Thread David Howells
Make a couple of fixes to AF_RXRPC:

 (1) The dead call timeout is shortened to 2 seconds.  Without this, each
 completed call sits around eating up resources for 10 seconds.  The calls
 need to hang around for a little while in case duplicate packets appear,
 but 10 seconds is excessive.

 (2) The number of available calls on a connection (conn->avail_calls) wasn't
 being decremented when a new call was allocated for a connection that
 didn't have any calls in progress.  This led to an occasional BUG occurring when
 we tried to find an empty channel slot on a connection that was supposed
 to have one available and didn't.

 In association with this, more assertions have been added to check this.
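
 The accounting that the second fix restores can be modelled in userspace.
 In the sketch below the limit of four matches the four channel slots checked
 by the patch's assertions; the model_* names and the MODEL_LIST_BUSY state
 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAXCALLS 4	/* four channel slots per connection */

/* Which of the bundle's lists the connection currently sits on. */
enum model_conn_list {
	MODEL_LIST_BUSY,	/* no free channels */
	MODEL_LIST_AVAIL,	/* some, but not all, channels free */
	MODEL_LIST_UNUSED,	/* every channel free */
};

struct model_conn {
	int avail_calls;
	enum model_conn_list list;
};

/* Allocate a channel: the count must drop on every allocation,
 * including the first one on a previously unused connection. */
static void model_alloc_call(struct model_conn *conn)
{
	assert(conn != NULL && conn->avail_calls > 0);

	conn->avail_calls--;
	conn->list = conn->avail_calls == 0 ?
		MODEL_LIST_BUSY : MODEL_LIST_AVAIL;
}

/* Release a channel and move the connection to the matching list. */
static void model_release_call(struct model_conn *conn)
{
	assert(conn != NULL);

	conn->avail_calls++;
	assert(conn->avail_calls >= 1 && conn->avail_calls <= MODEL_MAXCALLS);

	if (conn->avail_calls == MODEL_MAXCALLS)
		conn->list = MODEL_LIST_UNUSED;
	else
		conn->list = MODEL_LIST_AVAIL;
}
```

 If the decrement is skipped on the first allocation, the count ends up one
 higher than the number of free slots, which is the mismatch that the
 original BUG reported.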

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 net/rxrpc/ar-call.c   |   59 +
 net/rxrpc/ar-connection.c |   20 ++-
 2 files changed, 56 insertions(+), 23 deletions(-)

diff --git a/net/rxrpc/ar-call.c b/net/rxrpc/ar-call.c
index 1d7698a..4d92d88 100644
--- a/net/rxrpc/ar-call.c
+++ b/net/rxrpc/ar-call.c
@@ -19,7 +19,7 @@ struct kmem_cache *rxrpc_call_jar;
 LIST_HEAD(rxrpc_calls);
 DEFINE_RWLOCK(rxrpc_call_lock);
 static unsigned rxrpc_call_max_lifetime = 60;
-static unsigned rxrpc_dead_call_timeout = 10;
+static unsigned rxrpc_dead_call_timeout = 2;
 
 static void rxrpc_destroy_call(struct work_struct *work);
 static void rxrpc_call_life_expired(unsigned long _call);
@@ -398,6 +398,7 @@ found_extant_call:
  */
 void rxrpc_release_call(struct rxrpc_call *call)
 {
+	struct rxrpc_connection *conn = call->conn;
 	struct rxrpc_sock *rx = call->socket;
 
 	_enter("{%d,%d,%d,%d}",
@@ -413,8 +414,7 @@ void rxrpc_release_call(struct rxrpc_call *call)
 	/* dissociate from the socket
 	 * - the socket's ref on the call is passed to the death timer
 	 */
-	_debug("RELEASE CALL %p (%d CONN %p)",
-	       call, call->debug_id, call->conn);
+	_debug("RELEASE CALL %p (%d CONN %p)", call, call->debug_id, conn);
 
 	write_lock_bh(&rx->call_lock);
 	if (!list_empty(&call->accept_link)) {
@@ -430,24 +430,42 @@ void rxrpc_release_call(struct rxrpc_call *call)
 	}
 	write_unlock_bh(&rx->call_lock);
 
-	if (call->conn->out_clientflag)
-		spin_lock(&call->conn->trans->client_lock);
-	write_lock_bh(&call->conn->lock);
-
 	/* free up the channel for reuse */
-	if (call->conn->out_clientflag) {
-		call->conn->avail_calls++;
-		if (call->conn->avail_calls == RXRPC_MAXCALLS)
-			list_move_tail(&call->conn->bundle_link,
-				       &call->conn->bundle->unused_conns);
-		else if (call->conn->avail_calls == 1)
-			list_move_tail(&call->conn->bundle_link,
-				       &call->conn->bundle->avail_conns);
+	spin_lock(&conn->trans->client_lock);
+	write_lock_bh(&conn->lock);
+	write_lock(&call->state_lock);
+
+	if (conn->channels[call->channel] == call)
+		conn->channels[call->channel] = NULL;
+
+	if (conn->out_clientflag && conn->bundle) {
+		conn->avail_calls++;
+		switch (conn->avail_calls) {
+		case 1:
+			list_move_tail(&conn->bundle_link,
+				       &conn->bundle->avail_conns);
+		case 2 ... RXRPC_MAXCALLS - 1:
+			ASSERT(conn->channels[0] == NULL ||
+			       conn->channels[1] == NULL ||
+			       conn->channels[2] == NULL ||
+			       conn->channels[3] == NULL);
+			break;
+		case RXRPC_MAXCALLS:
+			list_move_tail(&conn->bundle_link,
+				       &conn->bundle->unused_conns);
+			ASSERT(conn->channels[0] == NULL &&
+			       conn->channels[1] == NULL &&
+			       conn->channels[2] == NULL &&
+			       conn->channels[3] == NULL);
+			break;
+		default:
+			printk(KERN_ERR "RxRPC: conn->avail_calls=%d\n",
+			       conn->avail_calls);
+			BUG();
+		}
 	}
 
-	write_lock(&call->state_lock);
-	if (call->conn->channels[call->channel] == call)
-		call->conn->channels[call->channel] = NULL;
+	spin_unlock(&conn->trans->client_lock);
 
 	if (call->state < RXRPC_CALL_COMPLETE &&
 	    call->state != RXRPC_CALL_CLIENT_FINAL_ACK) {
@@ -458,10 +476,9 @@ void rxrpc_release_call(struct rxrpc_call *call)
 		rxrpc_queue_call(call);
 	}
 	write_unlock(&call->state_lock);
-	write_unlock_bh(&call->conn->lock);
-	if (call->conn->out_clientflag)
-		spin_unlock(&call->conn->trans->client_lock);
+	write_unlock_bh(&conn->lock);
 
+   /* clean up the Rx queue

[PATCH 7/8] AFS: Permit key to be cached in nameidata

2007-04-11 Thread David Howells
Permit a key to be cached in the nameidata struct so that it only needs to be
looked up once when doing the sequence of d_revalidate(), permission(),
follow_link() and lookup() calls involved in a pathwalk.

This is used by the AFS filesystem to avoid repeatedly having to call
request_key().  Once looked up, the key is then available as the kernel walks
the tree until such a time as the kernel crosses to a non-AFS mountpoint or
an AFS mountpoint in a different cell.

The cache works like this:

 (1) The nameidata::key pointer is initialised to NULL at the start of the
 pathwalk (do_path_lookup()).  path_release() and co. release the key it
 points to.

 (2) Any filesystem operation performed during the pathwalk that has access to
 the nameidata (lookup, permission, follow_link, d_revalidate) can look at
 the key - if non-NULL - and if it's what they're looking for they can use
 it.

 If there's a key there of potential interest, the key's type and
 description should be checked to make sure the key is permissible.

 If of interest, key_validate() should be called to make sure the key is
 still usable.  If it isn't, the error should be passed back rather than
 the key lookup being redone on the basis that some earlier step is now no
 longer valid.

 (3) Any operation that is not interested in the key can either ignore it or
 release it and clear the pointer.

 (4) If an operation wants to put its own key there, it should release the old
 key and set the pointer to point to its own key with the key's usage count
 incremented.  This could be encapsulated in a function something like
 this:

void set_nd_key(struct nameidata *nd, struct key *key)
{
	key_put(nd->key);
	nd->key = key_get(key);
}
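
The reference counting behind step (4) can be exercised in userspace.  The
sketch below is a model, not the kernel key API: model_key, model_key_get()
and model_key_put() are hypothetical stand-ins, and the helpers are assumed
to tolerate NULL.  It takes the new reference before dropping the old one,
which is a variation on the function above chosen so that setting the same
key twice never lets the count dip; whether the kernel helper needs that
ordering is not established here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct key: just a usage count. */
struct model_key {
	int usage;
};

/* Hypothetical stand-in for the key pointer in struct nameidata. */
struct model_nameidata {
	struct model_key *key;
};

static struct model_key *model_key_get(struct model_key *key)
{
	if (key)
		key->usage++;
	return key;
}

static void model_key_put(struct model_key *key)
{
	if (key) {
		assert(key->usage > 0);
		key->usage--;
	}
}

/* Replace the cached key, holding one reference on whatever is cached. */
static void model_set_nd_key(struct model_nameidata *nd,
			     struct model_key *key)
{
	struct model_key *old = nd->key;

	nd->key = model_key_get(key);	/* take the new ref first */
	model_key_put(old);		/* then drop the old one */
}
```

At the end of the pathwalk the cached reference is dropped once, matching
the key_put() calls added to path_release() in this patch.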

Unfortunately there isn't currently a way to pass the key onto the inode
operations for create(), link(), unlink(), and suchlike, nor is there a way to
pass it to the open() file op without adding a struct key pointer argument to
each of these.

This might also be useful for NFS and CIFS.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/namei.c|5 +
 fs/open.c |7 +--
 include/linux/namei.h |1 +
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index ee60cc4..7a59d12 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -350,6 +350,8 @@ void path_release(struct nameidata *nd)
 {
 	dput(nd->dentry);
 	mntput(nd->mnt);
+	key_put(nd->key);
+	nd->key = NULL;
 }
 
 /*
@@ -360,6 +362,8 @@ void path_release_on_umount(struct nameidata *nd)
 {
 	dput(nd->dentry);
 	mntput_no_expire(nd->mnt);
+	key_put(nd->key);
+	nd->key = NULL;
 }
 
 /**
@@ -1108,6 +1112,7 @@ static int fastcall do_path_lookup(int dfd, const char *name,
 	struct file *file;
 	struct fs_struct *fs = current->fs;
 
+	nd->key = NULL;
 	nd->last_type = LAST_ROOT; /* if there are only slashes... */
 	nd->flags = flags;
 	nd->depth = 0;
diff --git a/fs/open.c b/fs/open.c
index c989fb4..77bd2a5 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -822,10 +822,13 @@ struct file *nameidata_to_filp(struct nameidata *nd, int flags)
 	/* Pick up the filp from the open intent */
 	filp = nd->intent.open.file;
 	/* Has the filesystem initialised the file for us? */
-	if (filp->f_path.dentry == NULL)
+	if (filp->f_path.dentry == NULL) {
 		filp = __dentry_open(nd->dentry, nd->mnt, flags, filp, NULL);
-	else
+		key_put(nd->key);
+		nd->key = NULL;
+	} else {
 		path_release(nd);
+	}
 	return filp;
 }
 
diff --git a/include/linux/namei.h b/include/linux/namei.h
index d39a5a6..d677408 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -17,6 +17,7 @@ enum { MAX_NESTED_LINKS = 8 };
 struct nameidata {
 	struct dentry	*dentry;
 	struct vfsmount *mnt;
+	struct key	*key;
 	struct qstr	last;
 	unsigned int	flags;
 	int		last_type;



[PATCH 5/8] AFS: Handle multiple mounts of an AFS superblock correctly

2007-04-11 Thread David Howells
Handle multiple mounts of an AFS superblock correctly, checking to see whether
the superblock is already initialised after calling sget() rather than just
unconditionally stamping all over it.

Also delete the "silent" parameter to afs_fill_super() as it's not used and
can, in any case, be obtained from sb->s_flags.
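
The create-or-reuse decision can be modelled in userspace.  In the sketch
below, model_sb, model_get_sb() and the MODEL_MS_ACTIVE value are
hypothetical; the fill_count field exists only so that the number of
initialisations can be observed.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MS_ACTIVE 0x1u	/* hypothetical flag value */

/* Hypothetical stand-in for the fields of struct super_block used here. */
struct model_sb {
	void *s_root;		/* NULL until the superblock is filled in */
	unsigned int flags;
	int fill_count;		/* how many times the fill step ran */
};

/* Model of the mount path after sget() has returned a superblock. */
static int model_get_sb(struct model_sb *sb, unsigned int flags, void *root)
{
	assert(sb != NULL && root != NULL);

	if (!sb->s_root) {
		/* initial superblock/root creation */
		sb->flags = flags;
		sb->s_root = root;	/* stands in for afs_fill_super() */
		sb->fill_count++;
		sb->flags |= MODEL_MS_ACTIVE;
	} else {
		/* reuse: the existing state must be left alone */
		assert(sb->flags & MODEL_MS_ACTIVE);
	}
	return 0;
}
```

A second mount of the same volume therefore leaves the flags and root set by
the first mount untouched.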

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/afs/super.c |   26 --
 1 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/fs/afs/super.c b/fs/afs/super.c
index efc4fe6..77e6875 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -212,7 +212,7 @@ static int afs_test_super(struct super_block *sb, void *data)
 /*
  * fill in the superblock
  */
-static int afs_fill_super(struct super_block *sb, void *data, int silent)
+static int afs_fill_super(struct super_block *sb, void *data)
 {
struct afs_mount_params *params = data;
struct afs_super_info *as = NULL;
@@ -319,17 +319,23 @@ static int afs_get_sb(struct file_system_type *fs_type,
 		goto error;
 	}
 
-	sb->s_flags = flags;
-
-	ret = afs_fill_super(sb, &params, flags & MS_SILENT ? 1 : 0);
-	if (ret < 0) {
-		up_write(&sb->s_umount);
-		deactivate_super(sb);
-		goto error;
+	if (!sb->s_root) {
+		/* initial superblock/root creation */
+		_debug("create");
+		sb->s_flags = flags;
+		ret = afs_fill_super(sb, &params);
+		if (ret < 0) {
+			up_write(&sb->s_umount);
+			deactivate_super(sb);
+			goto error;
+		}
+		sb->s_flags |= MS_ACTIVE;
+	} else {
+		_debug("reuse");
+		ASSERTCMP(sb->s_flags, &, MS_ACTIVE);
 	}
-	sb->s_flags |= MS_ACTIVE;
-	simple_set_mnt(mnt, sb);
 
+	simple_set_mnt(mnt, sb);
 	afs_put_volume(params.volume);
 	afs_put_cell(params.default_cell);
 	_leave(" = 0 [%p]", sb);



[PATCH 6/8] AFS: AF_RXRPC key changes

2007-04-11 Thread David Howells
Make two changes to the AF_RXRPC key handling to make it easier for AFS to
use:

 (1) Export key_type_rxrpc so that AFS can request keys of this type.

 (2) Make it possible to have keys that represent no security.  These are
 created by instantiating the keys with no data.
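
 The way a no-security key is resolved when a call is begun can be modelled
 in userspace.  In the sketch below, model_rxkey and model_effective_key()
 are hypothetical names; a NULL payload stands for a key instantiated with no
 data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a key: only the payload pointer matters. */
struct model_rxkey {
	const void *payload;	/* NULL means "no security" */
};

/*
 * Pick the key for a call: the caller's key if given, else the socket's
 * key.  A key without a payload is treated as no key at all.
 */
static const struct model_rxkey *
model_effective_key(const struct model_rxkey *sock_key,
		    const struct model_rxkey *call_key)
{
	const struct model_rxkey *key = call_key ? call_key : sock_key;

	if (key && !key->payload)
		key = NULL;	/* a no-security key */
	return key;
}
```

 This lets a caller request an unauthenticated call explicitly, even on a
 socket that carries a real key, by passing a key that has no data.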

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 include/keys/rxrpc-type.h |   22 ++
 net/rxrpc/af_rxrpc.c  |2 ++
 net/rxrpc/ar-key.c|   10 +-
 net/rxrpc/ar-output.c |6 +-
 4 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/include/keys/rxrpc-type.h b/include/keys/rxrpc-type.h
new file mode 100644
index 000..e2ee73a
--- /dev/null
+++ b/include/keys/rxrpc-type.h
@@ -0,0 +1,22 @@
+/* RxRPC key type
+ *
+ * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells ([EMAIL PROTECTED])
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _KEYS_RXRPC_TYPE_H
+#define _KEYS_RXRPC_TYPE_H
+
+#include <linux/key.h>
+
+/*
+ * key type for AF_RXRPC keys
+ */
+extern struct key_type key_type_rxrpc;
+
+#endif /* _KEYS_RXRPC_TYPE_H */
diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index 115ad19..9e37e4f 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -299,6 +299,8 @@ struct rxrpc_call *rxrpc_kernel_begin_call(struct socket *sock,
 
 	if (!key)
 		key = rx->key;
+	if (key && !key->payload.data)
+		key = NULL; /* a no-security key */
 
 	bundle = rxrpc_get_bundle(rx, trans, key, service_id, gfp);
 	if (IS_ERR(bundle)) {
diff --git a/net/rxrpc/ar-key.c b/net/rxrpc/ar-key.c
index 869a96c..7e049ff 100644
--- a/net/rxrpc/ar-key.c
+++ b/net/rxrpc/ar-key.c
@@ -19,6 +19,7 @@
 #include <linux/crypto.h>
 #include <net/sock.h>
 #include <net/af_rxrpc.h>
+#include <keys/rxrpc-type.h>
 #include <keys/user-type.h>
 #include "ar-internal.h"
 
@@ -40,6 +41,8 @@ struct key_type key_type_rxrpc = {
.describe   = rxrpc_describe,
 };
 
+EXPORT_SYMBOL(key_type_rxrpc);
+
 /*
  * rxrpc server defined keys take serviceId:securityIndex as the
  * description and an 8-byte decryption key as the payload
@@ -63,6 +66,8 @@ struct key_type key_type_rxrpc_s = {
  * 12  4   kvno
  * 16  8   session key
  * 24  [len]   ticket
+ *
+ * if no data is provided, then a no-security key is made
  */
 static int rxrpc_instantiate(struct key *key, const void *data, size_t datalen)
 {
@@ -74,6 +79,10 @@ static int rxrpc_instantiate(struct key *key, const void *data, size_t datalen)
 
 	_enter("{%x},,%zu", key_serial(key), datalen);
 
+	/* handle a no-security key */
+	if (!data && datalen == 0)
+		return 0;
+
 	/* get the key interface version number */
 	ret = -EINVAL;
 	if (datalen <= 4 || !data)
@@ -287,7 +296,6 @@ int rxrpc_get_server_data_key(struct rxrpc_connection *conn,
struct rxkad_key tsec;
} data;
 
-
 	_enter("");
 
 	key = key_alloc(&key_type_rxrpc, "x", 0, 0, current, 0,
diff --git a/net/rxrpc/ar-output.c b/net/rxrpc/ar-output.c
index ed7f3f4..d2d0baa 100644
--- a/net/rxrpc/ar-output.c
+++ b/net/rxrpc/ar-output.c
@@ -132,6 +132,7 @@ int rxrpc_client_sendmsg(struct kiocb *iocb, struct rxrpc_sock *rx,
 	enum rxrpc_command cmd;
 	struct rxrpc_call *call;
 	unsigned long user_call_ID = 0;
+	struct key *key;
 	__be16 service_id;
 	u32 abort_code = 0;
 	int ret;
@@ -153,7 +154,10 @@ int rxrpc_client_sendmsg(struct kiocb *iocb, struct rxrpc_sock *rx,
 				(struct sockaddr_rxrpc *) msg->msg_name;
 			service_id = htons(srx->srx_service);
 		}
-		bundle = rxrpc_get_bundle(rx, trans, rx->key, service_id,
+		key = rx->key;
+		if (key && !rx->key->payload.data)
+			key = NULL;
+		bundle = rxrpc_get_bundle(rx, trans, key, service_id,
 					  GFP_KERNEL);
 		if (IS_ERR(bundle))
 			return PTR_ERR(bundle);



[PATCH 4/8] AFS: Correctly alter relocation state after update and show state in /proc

2007-04-11 Thread David Howells
Correctly alter the relocation state after update is complete by switching it
from Updating to Valid.

Also display the record state in the vlocation database proc file.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/afs/proc.c  |   15 +--
 fs/afs/vlocation.c |4 +++-
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/fs/afs/proc.c b/fs/afs/proc.c
index 9e6af40..d5601f6 100644
--- a/fs/afs/proc.c
+++ b/fs/afs/proc.c
@@ -553,6 +553,16 @@ static void afs_proc_cell_volumes_stop(struct seq_file *p, void *v)
 	up_read(&cell->vl_sem);
 }
 
+const char afs_vlocation_states[][4] = {
+	[AFS_VL_NEW]		= "New",
+	[AFS_VL_CREATING]	= "Crt",
+	[AFS_VL_VALID]		= "Val",
+	[AFS_VL_NO_VOLUME]	= "NoV",
+	[AFS_VL_UPDATING]	= "Upd",
+	[AFS_VL_VOLUME_DELETED]	= "Del",
+	[AFS_VL_UNCERTAIN]	= "Unc",
+};
+
 /*
  * display a header line followed by a load of volume lines
  */
@@ -563,13 +573,14 @@ static int afs_proc_cell_volumes_show(struct seq_file *m, void *v)
 
 	/* display header on line 1 */
 	if (v == (void *) 1) {
-		seq_puts(m, "USE VLID[0]  VLID[1]  VLID[2]  NAME\n");
+		seq_puts(m, "USE STT VLID[0]  VLID[1]  VLID[2]  NAME\n");
 		return 0;
 	}
 
 	/* display one cell per line on subsequent lines */
-	seq_printf(m, "%3d %08x %08x %08x %s\n",
+	seq_printf(m, "%3d %s %08x %08x %08x %s\n",
 		   atomic_read(&vlocation->usage),
+		   afs_vlocation_states[vlocation->state],
 		   vlocation->vldb.vid[0],
 		   vlocation->vldb.vid[1],
 		   vlocation->vldb.vid[2],
diff --git a/fs/afs/vlocation.c b/fs/afs/vlocation.c
index f0f4419..9af1fe8 100644
--- a/fs/afs/vlocation.c
+++ b/fs/afs/vlocation.c
@@ -657,7 +657,7 @@ static void afs_vlocation_updater(struct work_struct *work)
 	switch (ret) {
 	case 0:
 		afs_vlocation_apply_update(vl, &vldb);
-		vl->state = AFS_VL_UPDATING;
+		vl->state = AFS_VL_VALID;
 		break;
 	case -ENOMEDIUM:
 		vl->state = AFS_VL_VOLUME_DELETED;
@@ -691,6 +691,8 @@ static void afs_vlocation_updater(struct work_struct *work)
 		timeout = afs_vlocation_update_timeout;
 	}
 
+	ASSERT(list_empty(&vl->update));
+
 	list_add_tail(&vl->update, &afs_vlocation_updates);
 
 	_debug("timeout %ld", timeout);



Re: [PATCH 8/8] AFS: Add security support

2007-04-11 Thread David Howells
J. Bruce Fields [EMAIL PROTECTED] wrote:

 Just curious--when is the actual crypto done?  There doesn't seem to be
 any in this patch.

See AF_RXRPC patch:

http://people.redhat.com/~dhowells/rxrpc/04-af_rxrpc.diff

You turn on CONFIG_RXKAD and load the rxkad module thus built (assuming you
haven't built it in) after loading the af_rxrpc module.  I probably should've
mentioned that in the cover.

So anyone using sockets of family AF_RXRPC can use it.  See these test
programs:

 (1) The klog test program fetches a ticket from the kaserver and adds it as a
 key of type rxrpc:

http://people.redhat.com/~dhowells/rxrpc/klog.c

 (2) The listen test program which listens for potentially secured incoming
 calls:

http://people.redhat.com/~dhowells/rxrpc/listen.c

 (3) The rxrpc test program which can make secure calls:

http://people.redhat.com/~dhowells/rxrpc/rxrpc.c

David


Re: Getting a network interface list from within the kernel

2007-04-13 Thread David Howells
David Miller [EMAIL PROTECTED] wrote:

 Issue a RTM_GETLINK rtnetlink request, and parse the response.

Okay, I've managed to find code that does this.  However, RTM_GETLINK does not
appear to return any IPv4 addressing information.  It does, however, contain
the MTU details which is one of the three things I wanted.

I found that RTM_GETADDR will give me the IPv4 address and something from
which I can calculate the netmask.
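
As a rough illustration of the netmask calculation mentioned above: assuming the "something" in the RTM_GETADDR reply is the prefix length (the ifa_prefixlen field of struct ifaddrmsg), the mask falls out of a single shift.  The helper name prefix_to_netmask() is made up for this sketch and is not taken from any posted code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert an IPv4 prefix length (0..32) into a netmask in host byte order.
 * Hypothetical helper for illustration only.
 */
uint32_t prefix_to_netmask(unsigned int prefixlen)
{
	assert(prefixlen <= 32);

	/* Shifting a 32-bit value by 32 is undefined, so handle /0 apart. */
	if (prefixlen == 0)
		return 0;

	return UINT32_MAX << (32 - prefixlen);
}
```

A caller would pass the result through htonl() before comparing it with an address taken off the wire.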

I don't suppose there's a single op that will allow me to get all three in one
go?

Oh, and can I assume that the interface index numbers returned by RTM_GETLINK
match those returned by RTM_GETADDR?  Even if an interface is removed between
issuing the two calls?  Alternatively, do I need to compare interface names as
those are available between both?

David


Can netlink_recvmsg not truncate messages unless asked to?

2007-04-13 Thread David Howells

Would it be feasible to make netlink_recvmsg() _not_ truncate a message unless
it is asked to by having MSG_TRUNC passed to it?

Unless netlink data packets are limited to PAGE_SIZE or less, it's entirely
possible that the kernel can be in a situation where it can't guarantee to get
a buffer large enough to receive a packet larger than that.

What I was trying to do was use recvmsg with MSG_PEEK to grab the nlmsghdr,
and then using that to predict the size of the buffer I need.  But that
doesn't work because a packet might contain multiple netlink messages.
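
To make the last point concrete, here is a small userspace sketch, not code from any patch, that packs two netlink messages into one buffer and walks them with the standard NLMSG_OK()/NLMSG_NEXT() macros.  The helper names (nl_append, nl_first_msg_len, nl_count) are invented for the example, and the buffer is assumed to be suitably aligned for struct nlmsghdr.

```c
#include <assert.h>
#include <linux/netlink.h>
#include <stddef.h>
#include <string.h>

/*
 * Append one netlink message with 'payload_len' zero bytes of payload at
 * offset 'off'.  Returns the new write offset, or 0 if it does not fit.
 */
size_t nl_append(char *buf, size_t buflen, size_t off,
		 unsigned short type, size_t payload_len)
{
	struct nlmsghdr nlh;

	if (off + NLMSG_SPACE(payload_len) > buflen)
		return 0;

	memset(buf + off, 0, NLMSG_SPACE(payload_len));
	memset(&nlh, 0, sizeof(nlh));
	nlh.nlmsg_len = NLMSG_LENGTH(payload_len);	/* header + payload */
	nlh.nlmsg_type = type;
	memcpy(buf + off, &nlh, sizeof(nlh));
	return off + NLMSG_SPACE(payload_len);		/* padded to alignment */
}

/* Length claimed by the first header, which is all a header peek reveals. */
size_t nl_first_msg_len(const void *buf)
{
	struct nlmsghdr nlh;

	memcpy(&nlh, buf, sizeof(nlh));
	return nlh.nlmsg_len;
}

/* Count the messages actually present in a datagram of 'len' bytes. */
int nl_count(const void *buf, size_t len)
{
	const struct nlmsghdr *nlh = buf;
	int remaining = (int)len;	/* NLMSG_NEXT() decrements this */
	int n = 0;

	assert(buf != NULL);
	for (; NLMSG_OK(nlh, remaining); nlh = NLMSG_NEXT(nlh, remaining))
		n++;
	return n;
}
```

With a 5-byte payload followed by an empty message, the first header claims 21 bytes while the datagram is 40 bytes long, so sizing the buffer from the peeked header would lose the second message.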

David


Possible bug in netlink_recvmsg()

2007-04-13 Thread David Howells

As I understand it, according to the recvmsg() manual page, if the packet
being returned is larger than the buffer provided, and the protocol does not
support piecemeal reception of data, then:

 (1) the buffer should be filled,

 (2) MSG_TRUNC should be set in msg_flags, and

 (3) the length of the full packet, including the discarded bit, should be
     returned.

AF_NETLINK sockets, however, do not do (3).  See this bit in netlink_recvmsg():

	copied = skb->len;
	if (len < copied) {
		msg->msg_flags |= MSG_TRUNC;
		copied = len;
	}

Or is this only true if the caller of recvmsg() passes MSG_TRUNC in?
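
For comparison, the behaviour described in (1) to (3) can be observed from userspace on a socket family that already implements it.  The sketch below is only a stand-in: it uses an AF_UNIX datagram socketpair rather than AF_NETLINK, and it assumes a Linux kernel recent enough to honour MSG_TRUNC as an input flag on such sockets.  The names recv_truncated() and struct trunc_result are invented for the example.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* What one receive attempt reported. */
struct trunc_result {
	ssize_t returned;	/* value returned by recvmsg() */
	int truncated;		/* non-zero if MSG_TRUNC came back in msg_flags */
};

/*
 * Send a 'sent'-byte datagram over a local socketpair and receive it into a
 * 'bufsize'-byte buffer, passing 'flags' to recvmsg().
 * Returns 0 on success and -1 on any failure.
 */
int recv_truncated(size_t sent, size_t bufsize, int flags,
		   struct trunc_result *res)
{
	char out[256], in[256];
	struct iovec iov = { .iov_base = in, .iov_len = bufsize };
	struct msghdr msg;
	int sv[2], ret = -1;

	assert(res != NULL);
	if (sent > sizeof(out) || bufsize > sizeof(in))
		return -1;
	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;

	memset(out, 'x', sent);
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;

	if (send(sv[0], out, sent, 0) == (ssize_t)sent) {
		res->returned = recvmsg(sv[1], &msg, flags);
		res->truncated = (msg.msg_flags & MSG_TRUNC) != 0;
		if (res->returned >= 0)
			ret = 0;
	}

	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

Calling it with flags of 0 should report only the bytes copied, while passing MSG_TRUNC should report the full datagram length; in both cases the output flag marks the truncation.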

David


[PATCH] AF_NETLINK: Support MSG_TRUNC passed to recvmsg()

2007-04-13 Thread David Howells
Support MSG_TRUNC when passed to recvmsg() as an argument on an AF_NETLINK
socket.  In such a case, the full size of the packet at the front of the Rx
queue should be returned, including any of it discarded when MSG_TRUNC is set
by recvmsg() on return.

If MSG_TRUNC is not set, then only the amount of data read into the buffer is
returned, and any discarded data goes uncounted.

This is according to the recvmsg() manual page.  AFS will make use of this
feature to work out the buffer size required to receive a netlink message by
combining MSG_TRUNC with MSG_PEEK.

This feature is useful on netlink sockets as recvmsg() there just discards any
of the packet that won't fit in the buffer.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 net/netlink/af_netlink.c |   15 +--
 1 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index e73d8f5..6288dd1 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1194,7 +1194,7 @@ static int netlink_recvmsg(struct kiocb *kiocb, struct socket *sock,
 	struct sock *sk = sock->sk;
 	struct netlink_sock *nlk = nlk_sk(sk);
 	int noblock = flags&MSG_DONTWAIT;
-	size_t copied;
+	size_t copy, copied;
 	struct sk_buff *skb;
 	int err;
 
@@ -1209,14 +1209,17 @@ static int netlink_recvmsg(struct kiocb *kiocb, struct socket *sock,
 
 	msg->msg_namelen = 0;
 
-	copied = skb->len;
-	if (len < copied) {
-		msg->msg_flags |= MSG_TRUNC;
-		copied = len;
+	copied = copy = skb->len;
+	if (len < copy) {
+		copy = len;
+		if (!(flags & MSG_TRUNC)) {
+			copied = len;
+			msg->msg_flags |= MSG_TRUNC;
+		}
 	}
 
 	skb->h.raw = skb->data;
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copy);
 
 	if (msg->msg_name) {
 		struct sockaddr_nl *addr = (struct sockaddr_nl*)msg->msg_name;



Re: [PATCH] CONFIG_PACKET_MMAP should depend on MMU

2007-04-17 Thread David Howells
Robin Getz [EMAIL PROTECTED] wrote:

 David - Does this give you what you need?

Possibly.  I'll have a look at it tomorrow.

David


Re: [PATCH] CONFIG_PACKET_MMAP should depend on MMU

2007-04-18 Thread David Howells
Aubrey Li [EMAIL PROTECTED] wrote:

 Here, in the attachment I wrote a small test app. Please correct if
 there is anything wrong, and feel free to improve it.

Okay... I have that working... probably.  I don't know what output it's
supposed to produce, but I see this:

# /packet-mmap/sample_packet_mmap
00-00-00-01-00-00-00-8a-00-00-00-8a-00-42-00-50-
38-43-13-a0-00-07-ff-3c-00-00-00-00-00-00-00-00-
00-11-08-00-00-00-00-01-00-01-00-06-00-d0-b7-de-
32-7b-00-00-00-00-00-00-00-00-00-00-00-00-00-00-
00-00-00-90-cc-a2-75-6b-00-d0-b7-de-32-7b-08-00-
45-00-00-7c-00-00-40-00-40-11-b4-13-c0-a8-02-80-
c0-a8-02-8d-08-01-03-20-00-68-8e-65-7f-5b-7e-03-
00-00-00-01-00-00-00-00-00-00-00-00-00-00-00-00-
00-00-00-00-00-00-00-00-00-00-00-01-00-00-81-a4-
00-00-00-01-00-00-00-00-00-00-00-00-00-1d-b8-86-
00-00-10-00-ff-ff-ff-ff-00-00-0e-f0-00-00-09-02-
01-cb-03-16-46-26-38-0d-00-00-00-00-46-26-38-1e-
00-00-00-00-46-26-38-1e-00-00-00-00-00-00-00-00-
00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00- [repeated]

Does that look reasonable?

I've attached the preliminary patch.  Note four things about it:

 (1) I've had to add the get_unmapped_area() op to the proto_ops struct, but
 I've only done it for CONFIG_MMU=n as making it available for CONFIG_MMU=y
 could cause problems.

 (2) There's a race between packet_get_unmapped_area() being called and
 packet_mmap() being called.

 (3) I've added an extra check into packet_set_ring() to make sure the caller
 isn't asking for a combination of buffer size and count that will exceed
 ULONG_MAX.  This protects a multiply done elsewhere.

 (4) The entire data buffer is allocated as one contiguous lump in NOMMU-mode.

David

---
[PATCH] NOMMU: Support mmap() on AF_PACKET sockets

From: David Howells [EMAIL PROTECTED]

Support mmap() on AF_PACKET sockets in NOMMU-mode kernels.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 include/linux/net.h|7 +++
 include/net/sock.h |8 +++
 net/core/sock.c|   10 
 net/packet/af_packet.c |  118 
 net/socket.c   |   77 +++
 5 files changed, 219 insertions(+), 1 deletions(-)

diff --git a/include/linux/net.h b/include/linux/net.h
index 4db21e6..9e77cf6 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -161,6 +161,11 @@ struct proto_ops {
int (*recvmsg)   (struct kiocb *iocb, struct socket *sock,
  struct msghdr *m, size_t total_len,
  int flags);
+#ifndef CONFIG_MMU
+	unsigned long	(*get_unmapped_area)(struct file *file, struct socket *sock,
+					     unsigned long addr, unsigned long len,
+					     unsigned long pgoff, unsigned long flags);
+#endif
int (*mmap)  (struct file *file, struct socket *sock,
  struct vm_area_struct * vma);
ssize_t (*sendpage)  (struct socket *sock, struct page *page,
@@ -191,6 +196,8 @@ extern int   sock_sendmsg(struct socket *sock, struct msghdr *msg,
 extern int  sock_recvmsg(struct socket *sock, struct msghdr *msg,
  size_t size, int flags);
 extern int  sock_map_fd(struct socket *sock);
+extern void sock_make_mappable(struct socket *sock,
+   unsigned long prot);
 extern struct socket *sockfd_lookup(int fd, int *err);
 #define sockfd_put(sock) fput(sock->file)
 extern int  net_ratelimit(void);
diff --git a/include/net/sock.h b/include/net/sock.h
index 2c7d60c..d91edea 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -841,6 +841,14 @@ extern int  sock_no_sendmsg(struct kiocb *, struct socket *,
 						struct msghdr *, size_t);
 extern int			sock_no_recvmsg(struct kiocb *, struct socket *,
 						struct msghdr *, size_t, int);
+#ifndef CONFIG_MMU
+extern unsigned long   sock_no_get_unmapped_area(struct file *,
+ struct socket *,
+ unsigned long,
+ unsigned long,
+ unsigned long,
+ unsigned long);
+#endif
 extern int sock_no_mmap(struct file *file,
 struct socket *sock,
 struct vm_area_struct *vma);
diff --git a/net/core/sock.c b/net/core/sock.c
index 27c4f62..b288799

Re: [PATCH] CONFIG_PACKET_MMAP should depend on MMU

2007-04-19 Thread David Howells
Aubrey Li [EMAIL PROTECTED] wrote:

 Yes, it's reasonable for me, as long as your
 host IP is 192.168.2.128
 and
 target IP is 192.168.2.141

That is correct, yes:-)

I expect it's an NFS packet as my board is using an NFS root at the moment.

David


Getting the new RxRPC patches upstream

2007-04-19 Thread David Howells
Eric W. Biederman [EMAIL PROTECTED] wrote:

 What is the ETA on your patches?

That depends on Dave Miller now, I think.  I'm assuming they need to go
through the network GIT tree to get to Linus.  Certainly Andrew Morton seems
to think so.

David


Re: Possible bug in netlink_recvmsg()

2007-04-19 Thread David Howells
David Miller [EMAIL PROTECTED] wrote:

 See this fix in my net-2.6.22 tree:
 
 commit ad495d7b6cfcd1bc2eaf06c42699be0bb5d84234
 Author: David S. Miller [EMAIL PROTECTED]
 Date:   Tue Mar 6 17:02:35 2007 -0800

Ummm... That seems to conflict with something in your net-2.6 tree.  Which one
should I use?

David


Re: Getting the new RxRPC patches upstream

2007-04-19 Thread David Howells
Eric W. Biederman [EMAIL PROTECTED] wrote:

 Ok.  I don't see any patches in -mm so I was assuming these patches have
 not been queued up anywhere.

They haven't been quite yet.  Is it your intention to kill these features in
2.6.22?

David


Re: [PATCH] CONFIG_PACKET_MMAP should depend on MMU

2007-04-20 Thread David Howells
Aubrey Li [EMAIL PROTECTED] wrote:

 The patch works properly on my side. But
 1) I'm not sure why you re-wrote alloc/free_pg_vec function, doesn't
 the current implement work for NOMMU? I know you want to allocate the
 entire data buffer as one contiguous lump, but is it really necessary?

Yes.  It's not possible to map the whole buffer otherwise.  Think about it!
mmap() returns _one_ reference address.  In MMU-mode, the non-contiguous
physical buffers can be made to appear virtually contiguous by fudging the
page tables and using the MMU.  This is not possible in NOMMU-mode.  The app
will expect the buffer to be one contiguous lump in its address space, and
will not be able to locate the other segments of the buffer.

Actually, what I said is not quite true.  It is possible to map the whole
buffer otherwise: I could lift the restriction that requires that you map the
whole buffer or not at all, and then userspace could stitch the whole lot
together itself.  This would then require userspace to be bimodal.

 2) So the mapped pages doesn't count into NR_FILE_MAPPED, is it a problem?

Not really, no - there are no pagetables.

Furthermore, issuing the PACKET_RX_RING sockopt does the entire allocation.
Any subsequent mmaps on it have little effect.

We could do that accounting though if you think it'd be better.  I don't
suppose it hurts.

David


Re: Getting the new RxRPC patches upstream

2007-04-20 Thread David Howells
David Miller [EMAIL PROTECTED] wrote:

 I applied already the patches I thought were appropriate,
 you had some crypto layer changes that you need to work
 out with Herbert Xu before the rest can be applied.

Should the rest of it go via Andrew's tree then?

David


Re: [PATCH] CONFIG_PACKET_MMAP should depend on MMU

2007-04-20 Thread David Howells
Aubrey Li [EMAIL PROTECTED] wrote:

 as checked in packet_set_ring, buffer size must be a multiple of PAGE_SIZE,
 packet_set_ring
 if (unlikely(req->tp_block_size & (PAGE_SIZE - 1)))
 
 So why not use __get_free_pages rather than kmalloc,

Because kmalloc() may be able to get us a smaller chunk of memory.  Actually,
calling __get_free_pages() and then releasing the excess pages might be
better.
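
As a back-of-the-envelope model of that idea (userspace arithmetic only, not kernel code): a page allocator that hands out power-of-two blocks must round the request up to the next order, and whatever lies beyond the requested page count is the excess that could be handed back.  The helper names below are invented for the sketch.

```c
#include <assert.h>
#include <limits.h>

/* Smallest order such that (1 << order) pages cover 'npages'. */
unsigned int order_for_pages(unsigned long npages)
{
	unsigned int order = 0;

	/* Keep the shift below within the width of unsigned long. */
	assert(npages >= 1 && npages <= (ULONG_MAX >> 1) + 1);

	while ((1UL << order) < npages)
		order++;
	return order;
}

/* Pages left over when 'npages' is rounded up to a power-of-two block. */
unsigned long excess_pages(unsigned long npages)
{
	return (1UL << order_for_pages(npages)) - npages;
}
```

For instance, a 768-page ring would need an order-10 block (1024 pages), leaving 256 pages, which is one order-8 block, to give back.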

 so that we have pagetables to count?

There are no pagetables in NOMMU-mode.

David


Re: Getting the new RxRPC patches upstream

2007-04-20 Thread David Howells
David Miller [EMAIL PROTECTED] wrote:

 Now that Herbert cleared up the crypto layer issues
 the only problem left is that there are generic changes
 in there which are not strictly networking but which
 your subsequent networking changes depend upon.

 This is a mess, and makes merging your work into the
 net-2.6.22 tree more difficult.

There are only two non-net patches that AF_RXRPC depends on:

 (1) The key facility changes.  That's all my code anyway, and shouldn't be a
 problem to merge unless someone else has put some changes in there that I
 don't know about.

 (2) try_to_cancel_delayed_work().  I suppose I could use
 cancel_delayed_work() instead, but that's less efficient as it waits for
 the timer completion function to finish.

And one that AFS depends on:

 (3) Cache the key in nameidata.  I still don't have Al's agreement on this,
 but it's purely caching, so I could drop that patch for the moment and
 excise the stuff that uses it from my AFS patches if that would help.

Do you class the AFS patches as networking changes?

Do you want me to consolidate my patches to make things simpler for you?

Do you want me to rebase my patches onto net-2.6.22?

I have the following patches, in order, available now, though I haven't yet
released the last few (they can all be downloaded from my RH people pages):

move-skb-generic.diff  (you've got this)
timers.diff
keys.diff
af_rxrpc.diff
afs-cleanup.diff
af_rxrpc-kernel.diff
af_rxrpc-afs.diff
af_rxrpc-delete-old.diff
af_rxrpc-own-workqueues.diff
af_rxrpc-fixes.diff
afs-callback-wq.diff
afs-vlocation.diff
afs-multimount.diff
afs-rxrpc-key.diff
afs-nameidata-key.diff
afs-security.diff
afs-doc.diff
netlink-support-MSG_TRUNC.diff  (you've got this)
afs-get-capabilities.diff
afs-initcallbackstate3.diff
afs-dir-write-support.diff

David


Re: [PATCH] CONFIG_PACKET_MMAP should depend on MMU

2007-04-20 Thread David Howells
Eric Dumazet [EMAIL PROTECTED] wrote:

 Is it really possible to allocate an order-10 page, then release part of it
 (say an order-8 subpage) ?

Yes.

David


Re: Getting the new RxRPC patches upstream

2007-04-23 Thread David Howells

 We only care when del_timer() returns true. In that case, if the timer
 function still runs (possible for single-threaded wqs), it has already
 passed __queue_work().

Why do you assume that?

David


Re: Getting the new RxRPC patches upstream

2007-04-24 Thread David Howells
Oleg Nesterov [EMAIL PROTECTED] wrote:

   We only care when del_timer() returns true. In that case, if the timer
   function still runs (possible for single-threaded wqs), it has already
   passed __queue_work().
  
  Why do you assume that?

Sorry, I should have been more clear.  I meant the assumption that we only
care about a true return from del_timer().

 If del_timer() returns true, the timer was pending. This means it was
 started by work->func() (note that __run_timers() clears timer_pending()
 before calling timer->function). This in turn means that
 delayed_work_timer_fn() has already called __queue_work(dwork), otherwise
 work->func() has no chance to run.

But if del_timer() returns 0, then there may be a problem.  We can't tell the
difference between the following two cases:

 (1) The timer hadn't been started.

 (2) The timer had been started, has expired and is no longer pending, but
 another CPU is running its handler routine.

try_to_del_timer_sync() _does_, however, distinguish between these cases: the
first is the 0 return, the second is the -1 return, and the case where it
dequeued the timer is the 1 return.

BTW, can a timer handler be preempted?  I assume not...  But it can be delayed
by interrupt processing.

David


Re: Getting the new RxRPC patches upstream

2007-04-24 Thread David Howells
Oleg Nesterov [EMAIL PROTECTED] wrote:

   The current code uses del_timer_sync(). It will also return 0. However,
   it will spin waiting for timer->function() to complete. So we are just
   wasting CPU.
  
  That's my objection to using cancel_delayed_work() as it stands, although in
  most cases it's a relatively minor waste of time.  However, if the timer
  expiry routine gets interrupted then it may not be so minor...  So, yes, I'm
  in full agreement with you there.
 
 Great. I'll send the s/del_timer_sync/del_timer/ patch.

I didn't say I necessarily agreed that this was a good idea.  I just meant that
I agree that it will waste CPU.  You must still audit all uses of
cancel_delayed_work().

 Aha, now I see what you mean. However. Why the code above is better then
 
   cancel_delayed_work(&afs_server_reaper);
   schedule_delayed_work(&afs_server_reaper, 0);
 
 ? (I assume we already changed cancel_delayed_work() to use del_timer).

Because calling schedule_delayed_work() is a waste of CPU if the timer expiry
handler is currently running at this time as *that* is going to also schedule
the delayed work item.

 If delayed_work_timer_fn() is not running - both variants (let's denote them
 as 1 and 2) do the same.

Yes, but that's not the point.

 Now suppose that delayed_work_timer_fn() is running.
 
   1: lock_timer_base(), return -1, skip schedule_delayed_work().

   2: check timer_pending(), return 0, call schedule_delayed_work(),
  return immediately because test_and_set_bit(WORK_STRUCT_PENDING)
  fails.

I don't see what you're illustrating here.  Are these meant to be two steps in
a single process?  Or are they two alternate steps?

 So I still don't think try_to_del_timer_sync() can help in this particular
 case.

It permits us to avoid the test_and_set_bit() under some circumstances.

 To some extent, try_to_cancel_delayed_work is
 
   int try_to_cancel_delayed_work(dwork)
   {
   	ret = cancel_delayed_work(dwork);
   	if (!ret && work_pending(&dwork->work))
   		ret = -1;
   	return ret;
   }
 
 iow, work_pending() looks like a more precise indication that work->func()
 is going to run soon.

Ah, but the timer routine may try to set the work item pending flag *after* the
work_pending() check you have here.  Furthermore, it would be better to avoid
the work_pending() check entirely because that check involves interacting with
atomic ops done on other CPUs.  try_to_del_timer_sync() returning -1 tells us
without a shadow of a doubt that the work item is either scheduled now or will
be scheduled very shortly, thus allowing us to avoid having to do it ourself.
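
Put as code, the decision being argued for looks roughly like this.  It is a userspace model only: the three return values are those described for try_to_del_timer_sync() earlier in the thread, and the enum and function names are invented for illustration.

```c
#include <assert.h>

/* The three outcomes described for try_to_del_timer_sync(). */
enum {
	TIMER_HANDLER_RUNNING = -1,	/* expired; handler active on another CPU */
	TIMER_NOT_PENDING     = 0,	/* timer had not been started */
	TIMER_DEQUEUED        = 1,	/* timer was pending and has been removed */
};

/*
 * Should the caller queue the work item itself after the cancel attempt?
 * When the handler is already running it is about to queue the item, so a
 * second attempt would only bounce off the pending bit.
 */
int caller_must_queue_work(int cancel_ret)
{
	assert(cancel_ret >= TIMER_HANDLER_RUNNING &&
	       cancel_ret <= TIMER_DEQUEUED);

	return cancel_ret != TIMER_HANDLER_RUNNING;
}
```

A plain del_timer() collapses the -1 and 0 cases into a single 0, which is why the caller cannot skip the requeue on that basis.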

David


Re: Getting the new RxRPC patches upstream

2007-04-24 Thread David Howells
Oleg Nesterov [EMAIL PROTECTED] wrote:

 Sure, I'll grep for cancel_delayed_work(). But unless I missed something,
 this change should be completely transparent for all users. Otherwise, it
 is buggy.

I guess you will have to make sure that cancel_delayed_work() is always
followed by a flush of the workqueue, otherwise you might get this situation:

	CPU 0				CPU 1
	===============================	===============================
					timer expires
	cancel_delayed_work(x) == 0	-->delayed_work_timer_fn(x)
	kfree(x);			-->do_IRQ()
	y = kmalloc(); // reuses x
					<--do_IRQ()
					__queue_work(x)
					--- OOPS ---

That's my main concern.  If you are certain that can't happen, then fair
enough.

Note that although you can call cancel_delayed_work() from within a work item
handler, you can't then follow it up with a flush as it's very likely to
deadlock.

  Because calling schedule_delayed_work() is a waste of CPU if the timer
  expiry handler is currently running at this time as *that* is going to
  also schedule the delayed work item.
 
 Yes. But otoh, try_to_del_timer_sync() is a waste of CPU compared to
 del_timer(), when the timer is not pending.

I suppose that's true.  As previously stated, my main objection to del_timer()
is the fact that it doesn't tell you if the timer expiry function is still
running.

Can you show me a patch illustrating exactly how you want to change
cancel_delayed_work()?  I can't remember whether you've done so already, but
if you have, I can't find it.  Is it basically this?:

 static inline int cancel_delayed_work(struct delayed_work *work)
 {
 	int ret;
 
-	ret = del_timer_sync(&work->timer);
+	ret = del_timer(&work->timer);
 	if (ret)
 		work_release(&work->work);
 	return ret;
 }

I was thinking this situation might be a problem:

	CPU 0				CPU 1
	===============================	===============================
					timer expires
	cancel_delayed_work(x) == 0	-->delayed_work_timer_fn(x)
	schedule_delayed_work(x,0)	-->do_IRQ()
	keventd scheduled
	x->work()
					<--do_IRQ()
					__queue_work(x)

But it won't, will it?

  Ah, but the timer routine may try to set the work item pending flag
  *after* the work_pending() check you have here.
 
 No, delayed_work_timer_fn() doesn't set the _PENDING flag.

Good point.  I don't think that's a problem because cancel_delayed_work()
won't clear the pending flag if it didn't remove a timer.

 First, this is very unlikely event, delayed_work_timer_fn() is very fast
 unless interrupted.

Yeah, I guess so.


Okay, you've convinced me, I think - provided you consider the case I
outlined at the top of this email.

If you give me a patch to alter cancel_delayed_work(), I'll substitute it for
mine and use that instead.  Dave Miller will just have to live with that
patch being there:-)

David


Re: Getting the new RxRPC patches upstream

2007-04-25 Thread David Howells
Oleg Nesterov [EMAIL PROTECTED] wrote:

 Yes sure. Note that this is documented:
 
   /*
    * Kill off a pending schedule_delayed_work().  Note that the work callback
    * function may still be running on return from cancel_delayed_work().  Run
    * flush_workqueue() or cancel_work_sync() to wait on it.
    */

No, it isn't documented.  It says that the *work* callback may be running, but
does not mention the timer callback.  However, just looking at the
cancellation function source made it clear that this would wait for the timer
handler to return first.


However, is it worth just making cancel_delayed_work() a void function and not
returning anything?  I'm not sure the return value is very useful.

David
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Getting the new RxRPC patches upstream

2007-04-25 Thread David Howells
Oleg Nesterov [EMAIL PROTECTED] wrote:

 Ah yes, it says nothing about what the returned value means...

Yeah...  If you could amend that as part of your patch, that'd be great.

David


[PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3]

2007-04-25 Thread David Howells

The first of these patches together provide secure client-side RxRPC
connectivity as a Linux kernel socket family.  Only the RxRPC transport/session
side is supplied - the presentation side (marshalling the data) is left to the
client.  Copies of the patches can be found here:

http://people.redhat.com/~dhowells/rxrpc/series
http://people.redhat.com/~dhowells/rxrpc/01-move-skb-generic.diff
http://people.redhat.com/~dhowells/rxrpc/02-cancel_delayed_work.diff
http://people.redhat.com/~dhowells/rxrpc/03-keys.diff
http://people.redhat.com/~dhowells/rxrpc/04-timer-exports.diff
http://people.redhat.com/~dhowells/rxrpc/05-af_rxrpc.diff

Further patches make the in-kernel AFS filesystem use AF_RXRPC and delete the
old RxRPC implementation:

http://people.redhat.com/~dhowells/rxrpc/06-afs-cleanup.diff
http://people.redhat.com/~dhowells/rxrpc/07-af_rxrpc-kernel.diff
http://people.redhat.com/~dhowells/rxrpc/08-af_rxrpc-afs.diff
http://people.redhat.com/~dhowells/rxrpc/09-af_rxrpc-delete-old.diff

And then the rest of the patches extend AFS to provide automatic unmounting of
automount trees, security support and directory-level write support (create,
mkdir, etc.):

http://people.redhat.com/~dhowells/rxrpc/10-afs-multimount.diff
http://people.redhat.com/~dhowells/rxrpc/11-afs-security.diff
http://people.redhat.com/~dhowells/rxrpc/12-afs-doc.diff

http://people.redhat.com/~dhowells/rxrpc/13-netlink-support-MSG_TRUNC.diff
http://people.redhat.com/~dhowells/rxrpc/14-afs-get-capabilities.diff
http://people.redhat.com/~dhowells/rxrpc/15-afs-initcallbackstate3.diff
http://people.redhat.com/~dhowells/rxrpc/16-afs-dir-write-support.diff

Note that file-level write support is not yet complete and so is not included
in this patch set.


The userspace access methods make use of the control data passed to/by
sendmsg() and recvmsg().  See the three simple test programs:

http://people.redhat.com/~dhowells/rxrpc/klog.c
http://people.redhat.com/~dhowells/rxrpc/rxrpc.c
http://people.redhat.com/~dhowells/rxrpc/listen.c
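
For anyone who hasn't used ancillary data before, here's a rough sketch of how
a control message might be attached to a msghdr before calling sendmsg().  It
only builds the buffer and doesn't touch a socket.  The ex_ names are made up
for this example, and the two numeric constants are assumptions standing in for
SOL_RXRPC and a "user call ID" control message type; check them against the
headers of the kernel you're actually running:

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Assumed values for illustration only; verify against the kernel headers. */
#define EX_SOL_RXRPC		272
#define EX_RXRPC_USER_CALL_ID	1

/*
 * Attach a user call ID to msg as ancillary data.  ctrl must be suitably
 * aligned and hold at least CMSG_SPACE(sizeof(unsigned long)) bytes.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int ex_attach_call_id(struct msghdr *msg, void *ctrl, size_t ctrl_len,
			     unsigned long call_id)
{
	struct cmsghdr *cmsg;

	if (ctrl_len < CMSG_SPACE(sizeof(call_id)))
		return -1;

	memset(ctrl, 0, ctrl_len);
	msg->msg_control = ctrl;
	msg->msg_controllen = CMSG_SPACE(sizeof(call_id));

	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = EX_SOL_RXRPC;
	cmsg->cmsg_type = EX_RXRPC_USER_CALL_ID;
	cmsg->cmsg_len = CMSG_LEN(sizeof(call_id));
	memcpy(CMSG_DATA(cmsg), &call_id, sizeof(call_id));
	return 0;
}
```

The receive side would walk the control buffer filled in by recvmsg() with
CMSG_FIRSTHDR()/CMSG_NXTHDR() in the usual way.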

The klog program is provided to go and get a Kerberos IV key from the AFS
kaserver.  Currently it must be edited before compiling to note the right
server IP address and the appropriate credentials.

These programs can be compiled by:

make klog rxrpc listen CFLAGS="-Wall -g" LDLIBS="-lcrypto -lcrypt -lkrb4 -lkeyutils"

Then a ticket can be obtained by:

./klog

If a security key is acquired in this way, then all subsequent AFS operations -
including VL lookups and mounts - performed with that session keyring will be
authenticated using that key.  The key can be viewed like so:

[EMAIL PROTECTED] ~]# keyctl show
Session Keyring
       -3 --alswrv      0     0  keyring: _ses.3268
        2 --alswrv      0     0   \_ keyring: _uid.0
111416553 --als--v      0     0       \_ rxrpc: [EMAIL PROTECTED]

TODO:

 (*) Make certain parameters (such as connection timeouts) userspace
 configurable.

 (*) Make userspace utilities use it; librxrpc.

 (*) Userspace documentation.

 (*) KerberosV security.

Changes:

 (*) SOCK_RPC has been removed.  SOCK_DGRAM is now used instead.

 (*) I've added a facility whereby calls can be made to destinations other than
 the connect() address of a client socket by making use of msg_name in the
 msghdr struct when using sendmsg() to send the first data packet of a
 call.  Indeed, a client socket need not be connected before being used
 so.

 (*) I've also added a facility whereby client calls may also be made on
 server sockets, again by using msg_name in the msghdr struct.  In such a
 case, the server's local transport endpoint is used.

 (*) I've made the write buffer space check available to various callers
 (sk_write_space) and implemented poll support.

 (*) Rewrote rxrpc_recvmsg().  It now concatenates adjacent data messages from
 the same call when delivering them.

 (*) Updated the documentation to include notes on recvmsg, cover control
 messages and cover SOL_RXRPC-level socket options.

 (*) Provided an in-kernel interface to give in-kernel utilities easier access
 to the facility.

 (*) Made fs/afs/ use it.

 (*) Deleted the old contents of net/rxrpc/.

 (*) Use the scatterlist interface to the crypto API for now.  The patch that
 added the direct access interface conflicts with patches Herbert Xu is
 producing, so I've dropped it for the moment.

 (*) Moved a bug fix to make secure connection reuse work from the
 af_rxrpc-kernel patch to the af_rxrpc main patch.

 (*) Make RxRPC use its own private work queues rather than keventd's to avoid
 deadlocks when AFS tries to use keventd too.  This also puts encryption
 in the private work queue rather than keventd's queue as that might take
 a relatively long time to 

[PATCH 03/16] AF_RXRPC: Key facility changes for AF_RXRPC [try #3]

2007-04-25 Thread David Howells
Export the keyring key type definition and document its availability.

Add alternative types into the key's type_data union to make it more useful.
Not all users necessarily want to use it as a list_head (AF_RXRPC doesn't, for
example), so make it clear that it can be used in other ways.
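
To illustrate the point, here's a small userspace sketch of a union laid out
like type_data after this patch, with a pair of helpers that stash a private
pointer in it rather than treating it as a list link.  The ex_ names are
invented for the example and struct ex_list_head is only a stand-in for the
kernel's list_head:

```c
#include <stddef.h>

/* Stand-in for the kernel's struct list_head (two pointers). */
struct ex_list_head {
	struct ex_list_head *next, *prev;
};

/* Modelled on the type_data union as extended by this patch. */
union ex_type_data {
	struct ex_list_head	link;
	unsigned long		x[2];
	void			*p[2];
};

/* Stash an opaque pointer in the union without going through 'link'. */
static void ex_set_private(union ex_type_data *td, void *ptr)
{
	td->p[0] = ptr;
	td->p[1] = NULL;
}

/* Read the opaque pointer back. */
static void *ex_get_private(const union ex_type_data *td)
{
	return td->p[0];
}
```

The extra members don't make the union any bigger than the list_head it
already held; they just give the other interpretations a name.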

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 Documentation/keys.txt  |   12 ++++++++++++
 include/linux/key.h     |    2 ++
 security/keys/keyring.c |    2 ++
 3 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/Documentation/keys.txt b/Documentation/keys.txt
index 60c665d..81d9aa0 100644
--- a/Documentation/keys.txt
+++ b/Documentation/keys.txt
@@ -859,6 +859,18 @@ payload contents for more information.
 	void unregister_key_type(struct key_type *type);
 
 
+Under some circumstances, it may be desirable to deal with a
+bundle of keys.  The facility provides access to the keyring type for managing
+such a bundle:
+
+	struct key_type key_type_keyring;
+
+This can be used with a function such as request_key() to find a specific
+keyring in a process's keyrings.  A keyring thus found can then be searched
+with keyring_search().  Note that it is not possible to use request_key() to
+search a specific keyring, so using keyrings in this way is of limited utility.
+
+
 ===================================
 NOTES ON ACCESSING PAYLOAD CONTENTS
 ===================================
diff --git a/include/linux/key.h b/include/linux/key.h
index 169f05e..a9220e7 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -160,6 +160,8 @@ struct key {
 	 */
 	union {
 		struct list_head	link;
+		unsigned long		x[2];
+		void			*p[2];
 	} type_data;
 
 	/* key data
diff --git a/security/keys/keyring.c b/security/keys/keyring.c
index ad45ce7..88292e3 100644
--- a/security/keys/keyring.c
+++ b/security/keys/keyring.c
@@ -66,6 +66,8 @@ struct key_type key_type_keyring = {
 	.read		= keyring_read,
 };
 
+EXPORT_SYMBOL(key_type_keyring);
+
 /*
  * semaphore to serialise link/link calls to prevent two link calls in parallel
  * introducing a cycle



[PATCH 01/16] AF_RXRPC: Move generic skbuff stuff from XFRM code to generic code [try #3]

2007-04-25 Thread David Howells
Move generic skbuff stuff from XFRM code to generic code so that AF_RXRPC can
use it too.

The kdoc comments I've attached to the functions need to be checked by whoever
wrote the functions, as I had to make some guesses about how they work.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 include/linux/skbuff.h |6 ++
 include/net/esp.h  |2 -
 net/core/skbuff.c  |  188 
 net/xfrm/xfrm_algo.c   |  169 ---
 4 files changed, 194 insertions(+), 171 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 5992f65..c905d42 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -83,6 +83,7 @@
  */
 
 struct net_device;
+struct scatterlist;
 
 #ifdef CONFIG_NETFILTER
 struct nf_conntrack {
@@ -361,6 +362,11 @@ extern struct sk_buff *skb_realloc_headroom(struct sk_buff *skb,
 extern struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
 				       int newheadroom, int newtailroom,
 				       gfp_t priority);
+extern int	       skb_to_sgvec(struct sk_buff *skb,
+				    struct scatterlist *sg, int offset,
+				    int len);
+extern int	       skb_cow_data(struct sk_buff *skb, int tailbits,
+				    struct sk_buff **trailer);
 extern int	       skb_pad(struct sk_buff *skb, int pad);
 #define dev_kfree_skb(a)	kfree_skb(a)
 extern void	      skb_over_panic(struct sk_buff *skb, int len,
diff --git a/include/net/esp.h b/include/net/esp.h
index 713d039..d05d8d2 100644
--- a/include/net/esp.h
+++ b/include/net/esp.h
@@ -40,8 +40,6 @@ struct esp_data
 	} auth;
 };
 
-extern int skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len);
-extern int skb_cow_data(struct sk_buff *skb, int tailbits, struct sk_buff **trailer);
 extern void *pskb_put(struct sk_buff *skb, struct sk_buff *tail, int len);
 
 static inline int esp_mac_digest(struct esp_data *esp, struct sk_buff *skb,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 336958f..aa02bd4 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -55,6 +55,7 @@
 #include <linux/cache.h>
 #include <linux/rtnetlink.h>
 #include <linux/init.h>
+#include <linux/scatterlist.h>
 
 #include <net/protocol.h>
 #include <net/dst.h>
@@ -2005,6 +2006,190 @@ void __init skb_init(void)
 						NULL, NULL);
 }
 
+/**
+ * skb_to_sgvec - Fill a scatter-gather list from a socket buffer
+ * @skb: Socket buffer containing the buffers to be mapped
+ * @sg: The scatter-gather list to map into
+ * @offset: The offset into the buffer's contents to start mapping
+ * @len: Length of buffer space to be mapped
+ *
+ * Fill the specified scatter-gather list with mappings/pointers into a
+ * region of the buffer space attached to a socket buffer.
+ */
+int
+skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
+{
+	int start = skb_headlen(skb);
+	int i, copy = start - offset;
+	int elt = 0;
+
+	if (copy > 0) {
+		if (copy > len)
+			copy = len;
+		sg[elt].page = virt_to_page(skb->data + offset);
+		sg[elt].offset = (unsigned long)(skb->data + offset) % PAGE_SIZE;
+		sg[elt].length = copy;
+		elt++;
+		if ((len -= copy) == 0)
+			return elt;
+		offset += copy;
+	}
+
+	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+		int end;
+
+		BUG_TRAP(start <= offset + len);
+
+		end = start + skb_shinfo(skb)->frags[i].size;
+		if ((copy = end - offset) > 0) {
+			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+
+			if (copy > len)
+				copy = len;
+			sg[elt].page = frag->page;
+			sg[elt].offset = frag->page_offset+offset-start;
+			sg[elt].length = copy;
+			elt++;
+			if (!(len -= copy))
+				return elt;
+			offset += copy;
+		}
+		start = end;
+	}
+
+	if (skb_shinfo(skb)->frag_list) {
+		struct sk_buff *list = skb_shinfo(skb)->frag_list;
+
+		for (; list; list = list->next) {
+			int end;
+
+			BUG_TRAP(start <= offset + len);
+
+			end = start + list->len;
+			if ((copy = end - offset) > 0) {
+				if (copy > len)
+					copy = len;
+				elt += skb_to_sgvec(list, sg+elt, offset - start, copy);
+				if ((len -= copy) == 0

[PATCH 02/16] cancel_delayed_work: use del_timer() instead of del_timer_sync() [try #3]

2007-04-25 Thread David Howells
del_timer_sync() buys nothing for cancel_delayed_work(), but it is less
efficient since it locks the timer unconditionally, and may wait for the
completion of the delayed_work_timer_fn().

cancel_delayed_work() == 0 means:

	before this patch:
		work->func may still be running or queued

	after this patch:
		work->func may still be running or queued, or
		delayed_work_timer_fn->__queue_work() in progress.

		The latter doesn't differ from the caller's POV,
		delayed_work_timer_fn() is called with _PENDING
		bit set.

cancel_delayed_work() == 1 with this patch adds a new possibility:

	delayed_work->work was cancelled, but delayed_work_timer_fn
	is still running (this is only possible for the re-arming
	works on single-threaded workqueue).

	In this case the timer was re-started by work->func(), nobody
	else can do this. This in turn means that delayed_work_timer_fn
	has already passed __queue_work() (and won't touch delayed_work)
	because nobody else can queue delayed_work->work.
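
From the caller's point of view the rule stays simple.  The following is a toy
userspace model, not kernel code: the ex_ names and the two-flag state are
invented for illustration, and the re-arming case described above is
deliberately not modelled.  It only shows how the return value decides whether
the caller still has to wait for the work item:

```c
#include <stdbool.h>

/* Toy model of a delayed work item. */
struct ex_delayed_work {
	bool timer_pending;	/* timer armed, work not yet queued */
	bool work_queued;	/* timer fired; work is queued or running */
};

/* Models del_timer(): disarm the timer, return 1 if it was armed. */
static int ex_del_timer(struct ex_delayed_work *dw)
{
	int was_pending = dw->timer_pending;

	dw->timer_pending = false;
	return was_pending;
}

/* Models cancel_delayed_work(): 1 means the timer was caught in time. */
static int ex_cancel_delayed_work(struct ex_delayed_work *dw)
{
	return ex_del_timer(dw);
}

/*
 * Caller pattern: a return of 0 means the work function may be queued or
 * running, so the caller must flush before freeing anything it uses.
 */
static bool ex_cancel_needs_flush(struct ex_delayed_work *dw)
{
	return ex_cancel_delayed_work(dw) == 0;
}
```

In real code the "needs flush" branch would call flush_workqueue() or
cancel_work_sync(), as the updated comment in the patch says.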

Signed-off-by: Oleg Nesterov [EMAIL PROTECTED]
Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 include/linux/workqueue.h |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 2a7b38d..b8abfc7 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -191,14 +191,15 @@ int execute_in_process_context(work_func_t fn, struct execute_work *);
 
 /*
  * Kill off a pending schedule_delayed_work().  Note that the work callback
- * function may still be running on return from cancel_delayed_work().  Run
- * flush_scheduled_work() to wait on it.
+ * function may still be running on return from cancel_delayed_work(), unless
+ * it returns 1 and the work doesn't re-arm itself. Run flush_workqueue() or
+ * cancel_work_sync() to wait on it.
  */
 static inline int cancel_delayed_work(struct delayed_work *work)
 {
 	int ret;
 
-	ret = del_timer_sync(&work->timer);
+	ret = del_timer(&work->timer);
 	if (ret)
 		work_release(&work->work);
 	return ret;



[PATCH 10/16] AFS: Handle multiple mounts of an AFS superblock correctly [try #3]

2007-04-25 Thread David Howells
Handle multiple mounts of an AFS superblock correctly, checking to see whether
the superblock is already initialised after calling sget() rather than just
unconditionally stamping all over it.

Also delete the silent parameter to afs_fill_super() as it's not used and
can, in any case, be obtained from sb->s_flags.
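
The shape of the fix, reduced to a toy userspace model (the ex_ names and the
struct are invented here and are not the kernel's super_block; the real change
is in the diff below): a lookup may hand back an object that is already set
up, so the fill routine should only run when the root is still unset.

```c
#include <stddef.h>

/* Toy stand-in for a superblock. */
struct ex_super {
	void *root;		/* NULL until filled in, like sb->s_root */
	int   fill_count;	/* how many times the fill routine has run */
};

static int ex_root_object;	/* dummy root for the superblock to point at */

/* One-time initialisation of a freshly created superblock. */
static int ex_fill_super(struct ex_super *sb)
{
	sb->root = &ex_root_object;
	sb->fill_count++;
	return 0;
}

/* Mount path: fill on first sight, leave an existing superblock alone. */
static int ex_mount(struct ex_super *sb)
{
	if (!sb->root)
		return ex_fill_super(sb);	/* initial creation */
	return 0;				/* reuse */
}
```

Mounting the same ex_super twice runs ex_fill_super() only once, which is the
behaviour the patch is after for repeated mounts of one volume.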

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/afs/super.c |   26 ++++++++++++++++----------
 1 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/fs/afs/super.c b/fs/afs/super.c
index efc4fe6..77e6875 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -212,7 +212,7 @@ static int afs_test_super(struct super_block *sb, void *data)
 /*
  * fill in the superblock
  */
-static int afs_fill_super(struct super_block *sb, void *data, int silent)
+static int afs_fill_super(struct super_block *sb, void *data)
 {
 	struct afs_mount_params *params = data;
 	struct afs_super_info *as = NULL;
@@ -319,17 +319,23 @@ static int afs_get_sb(struct file_system_type *fs_type,
 		goto error;
 	}
 
-	sb->s_flags = flags;
-
-	ret = afs_fill_super(sb, &params, flags & MS_SILENT ? 1 : 0);
-	if (ret < 0) {
-		up_write(&sb->s_umount);
-		deactivate_super(sb);
-		goto error;
+	if (!sb->s_root) {
+		/* initial superblock/root creation */
+		_debug("create");
+		sb->s_flags = flags;
+		ret = afs_fill_super(sb, &params);
+		if (ret < 0) {
+			up_write(&sb->s_umount);
+			deactivate_super(sb);
+			goto error;
+		}
+		sb->s_flags |= MS_ACTIVE;
+	} else {
+		_debug("reuse");
+		ASSERTCMP(sb->s_flags, &, MS_ACTIVE);
 	}
-	sb->s_flags |= MS_ACTIVE;
-	simple_set_mnt(mnt, sb);
 
+	simple_set_mnt(mnt, sb);
 	afs_put_volume(params.volume);
 	afs_put_cell(params.default_cell);
 	_leave(" = 0 [%p]", sb);



[PATCH 13/16] commit ad495d7b6cfcd1bc2eaf06c42699be0bb5d84234 [try #3]

2007-04-25 Thread David Howells
[NETLINK]: Mirror UDP MSG_TRUNC semantics.

If the user passes MSG_TRUNC in via msg_flags, return
the full packet size not the truncated size.
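
The UDP behaviour being mirrored can be seen from userspace with something
like the sketch below.  It assumes Linux and a working loopback interface, and
ex_recv_trunc_len() is a name made up for this example.  It sends a datagram to
itself and receives it into a deliberately short buffer with MSG_TRUNC set:

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Send send_len bytes to ourselves over loopback UDP and receive them into
 * a buffer of only buf_len bytes with MSG_TRUNC set.  Returns whatever
 * recv() reported, or -1 on any error.
 */
static ssize_t ex_recv_trunc_len(size_t send_len, size_t buf_len)
{
	char out[512], in[512];
	struct sockaddr_in addr;
	socklen_t alen = sizeof(addr);
	ssize_t n = -1;
	int fd;

	if (send_len > sizeof(out) || buf_len > sizeof(in))
		return -1;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* let the kernel pick a free port */

	memset(out, 'x', sizeof(out));
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
	    getsockname(fd, (struct sockaddr *)&addr, &alen) == 0 &&
	    sendto(fd, out, send_len, 0,
		   (struct sockaddr *)&addr, sizeof(addr)) == (ssize_t)send_len)
		n = recv(fd, in, buf_len, MSG_TRUNC);

	close(fd);
	return n;
}
```

With MSG_TRUNC the return value is the length of the datagram as sent, so a
caller can tell that its buffer was too small; the patch gives netlink sockets
the same property.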

Idea from Herbert Xu and Thomas Graf.

Signed-off-by: David S. Miller [EMAIL PROTECTED]
---

 net/netlink/af_netlink.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index c48b0f4..5890210 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1242,6 +1242,9 @@ static int netlink_recvmsg(struct kiocb *kiocb, struct socket *sock,
 
 	scm_recv(sock, msg, siocb->scm, flags);
 
+	if (flags & MSG_TRUNC)
+		copied = skb->len;
+
 out:
 	netlink_rcv_wake(sk);
 	return err ? : copied;



[PATCH 15/16] AFS: Implement the CB.InitCallBackState3 operation [try #3]

2007-04-25 Thread David Howells
Implement the CB.InitCallBackState3 operation for the fileserver to call.
This reduces the amount of network traffic because if this op is aborted, the
fileserver will then attempt a CB.InitCallBackState operation.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/afs/AFS_CM.h    |    1 +
 fs/afs/cmservice.c |   46 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 0 deletions(-)

diff --git a/fs/afs/AFS_CM.h b/fs/afs/AFS_CM.h
index d4bd201..7b4d4fa 100644
--- a/fs/afs/AFS_CM.h
+++ b/fs/afs/AFS_CM.h
@@ -23,6 +23,7 @@ enum AFS_CM_Operations {
 	CBGetCE			= 208,	/* get cache file description */
 	CBGetXStatsVersion	= 209,	/* get version of extended statistics */
 	CBGetXStats		= 210,	/* get contents of extended statistics data */
+	CBInitCallBackState3	= 213,	/* initialise callback state, version 3 */
 	CBGetCapabilities	= 65538, /* get client capabilities */
 };
 
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index 5139723..3d58861 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -20,6 +20,8 @@ struct workqueue_struct *afs_cm_workqueue;
 
 static int afs_deliver_cb_init_call_back_state(struct afs_call *,
 					       struct sk_buff *, bool);
+static int afs_deliver_cb_init_call_back_state3(struct afs_call *,
+						struct sk_buff *, bool);
 static int afs_deliver_cb_probe(struct afs_call *, struct sk_buff *, bool);
 static int afs_deliver_cb_callback(struct afs_call *, struct sk_buff *, bool);
 static int afs_deliver_cb_get_capabilities(struct afs_call *, struct sk_buff *,
@@ -47,6 +49,16 @@ static const struct afs_call_type afs_SRXCBInitCallBackState = {
 };
 
 /*
+ * CB.InitCallBackState3 operation type
+ */
+static const struct afs_call_type afs_SRXCBInitCallBackState3 = {
+	.name		= "CB.InitCallBackState3",
+	.deliver	= afs_deliver_cb_init_call_back_state3,
+	.abort_to_error	= afs_abort_to_error,
+	.destructor	= afs_cm_destructor,
+};
+
+/*
  * CB.Probe operation type
  */
 static const struct afs_call_type afs_SRXCBProbe = {
@@ -83,6 +95,9 @@ bool afs_cm_incoming_call(struct afs_call *call)
 	case CBInitCallBackState:
 		call->type = &afs_SRXCBInitCallBackState;
 		return true;
+	case CBInitCallBackState3:
+		call->type = &afs_SRXCBInitCallBackState3;
+		return true;
 	case CBProbe:
 		call->type = &afs_SRXCBProbe;
 		return true;
@@ -312,6 +327,37 @@ static int afs_deliver_cb_init_call_back_state(struct afs_call *call,
 }
 
 /*
+ * deliver request data to a CB.InitCallBackState3 call
+ */
+static int afs_deliver_cb_init_call_back_state3(struct afs_call *call,
+						struct sk_buff *skb,
+						bool last)
+{
+	struct afs_server *server;
+	struct in_addr addr;
+
+	_enter(",{%u},%d", skb->len, last);
+
+	if (!last)
+		return 0;
+
+	/* no unmarshalling required */
+	call->state = AFS_CALL_REPLYING;
+
+	/* we'll need the file server record as that tells us which set of
+	 * vnodes to operate upon */
+	memcpy(&addr, &skb->nh.iph->saddr, 4);
+	server = afs_find_server(&addr);
+	if (!server)
+		return -ENOTCONN;
+	call->server = server;
+
+	INIT_WORK(&call->work, SRXAFSCB_InitCallBackState);
+	schedule_work(&call->work);
+	return 0;
+}
+
+/*
  * allow the fileserver to see if the cache manager is still alive
  */
 static void SRXAFSCB_Probe(struct work_struct *work)



[PATCH 14/16] AFS: Add support for the CB.GetCapabilities operation [try #3]

2007-04-25 Thread David Howells
Add support for the CB.GetCapabilities operation with which the fileserver can
ask the client for the following information:

 (1) The list of network interfaces it has available as IPv4 address + netmask
 plus the MTUs.

 (2) The client's UUID.

 (3) The extended capabilities of the client, for which the only current one
 is unified error mapping (abort code interpretation).

To support this, the patch adds the following routines to AFS:

 (1) A function to iterate through all the network interfaces using RTNETLINK
 to extract IPv4 addresses and MTUs.

 (2) A function to iterate through all the network interfaces using RTNETLINK
 to pull out the MAC address of the lowest index interface to use in UUID
 construction.
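
As a rough picture of what goes on the wire for item (1) of the information
list, here's a userspace sketch that packs an interface list into a structure
laid out like the InterfaceAddr part of the reply.  The ex_ names are invented
for this example, and it assumes the caller supplies addresses, netmasks and
MTUs in host byte order, which may not match how the kernel code holds them:

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

#define EX_MAX_IFS 32

/* Hypothetical host-order description of one interface. */
struct ex_interface {
	uint32_t address;
	uint32_t netmask;
	uint32_t mtu;
};

/* Wire layout modelled on the InterfaceAddr part of the reply. */
struct ex_interface_addr {
	uint32_t nifs;
	uint32_t uuid[11];
	uint32_t ifaddr[EX_MAX_IFS];
	uint32_t netmask[EX_MAX_IFS];
	uint32_t mtu[EX_MAX_IFS];
};

/*
 * Pack nifs interfaces into ia in network byte order, zeroing the unused
 * slots and the UUID words.  Returns 0 on success, -1 if nifs is out of
 * range.
 */
static int ex_pack_interfaces(struct ex_interface_addr *ia,
			      const struct ex_interface *ifs, int nifs)
{
	int i;

	if (nifs < 0 || nifs > EX_MAX_IFS)
		return -1;

	memset(ia, 0, sizeof(*ia));
	ia->nifs = htonl((uint32_t)nifs);
	for (i = 0; i < nifs; i++) {
		ia->ifaddr[i]  = htonl(ifs[i].address);
		ia->netmask[i] = htonl(ifs[i].netmask);
		ia->mtu[i]     = htonl(ifs[i].mtu);
	}
	return 0;
}
```

The arrays are fixed at 32 entries, so a short interface list simply leaves
the remaining slots zeroed.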

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/afs/AFS_CM.h|3 
 fs/afs/Makefile|1 
 fs/afs/cmservice.c |   98 ++
 fs/afs/internal.h  |   42 
 fs/afs/main.c  |   49 +
 fs/afs/rxrpc.c |   39 
 fs/afs/use-rtnetlink.c |  473 
 7 files changed, 705 insertions(+), 0 deletions(-)

diff --git a/fs/afs/AFS_CM.h b/fs/afs/AFS_CM.h
index 7c8e3d4..d4bd201 100644
--- a/fs/afs/AFS_CM.h
+++ b/fs/afs/AFS_CM.h
@@ -23,6 +23,9 @@ enum AFS_CM_Operations {
 	CBGetCE			= 208,	/* get cache file description */
 	CBGetXStatsVersion	= 209,	/* get version of extended statistics */
 	CBGetXStats		= 210,	/* get contents of extended statistics data */
+	CBGetCapabilities	= 65538, /* get client capabilities */
 };
 
+#define AFS_CAP_ERROR_TRANSLATION	0x1
+
 #endif /* AFS_FS_H */
diff --git a/fs/afs/Makefile b/fs/afs/Makefile
index cca198b..01545eb 100644
--- a/fs/afs/Makefile
+++ b/fs/afs/Makefile
@@ -18,6 +18,7 @@ kafs-objs := \
 	security.o \
 	server.o \
 	super.o \
+	use-rtnetlink.o \
 	vlclient.o \
 	vlocation.o \
 	vnode.o \
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index 9cb3ac5..5139723 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -22,6 +22,8 @@ static int afs_deliver_cb_init_call_back_state(struct afs_call *,
 					       struct sk_buff *, bool);
 static int afs_deliver_cb_probe(struct afs_call *, struct sk_buff *, bool);
 static int afs_deliver_cb_callback(struct afs_call *, struct sk_buff *, bool);
+static int afs_deliver_cb_get_capabilities(struct afs_call *, struct sk_buff *,
+					   bool);
 static void afs_cm_destructor(struct afs_call *);
 
 /*
@@ -55,6 +57,16 @@ static const struct afs_call_type afs_SRXCBProbe = {
 };
 
 /*
+ * CB.GetCapabilities operation type
+ */
+static const struct afs_call_type afs_SRXCBGetCapabilites = {
+	.name		= "CB.GetCapabilities",
+	.deliver	= afs_deliver_cb_get_capabilities,
+	.abort_to_error	= afs_abort_to_error,
+	.destructor	= afs_cm_destructor,
+};
+
+/*
  * route an incoming cache manager call
  * - return T if supported, F if not
  */
@@ -74,6 +86,9 @@ bool afs_cm_incoming_call(struct afs_call *call)
 	case CBProbe:
 		call->type = &afs_SRXCBProbe;
 		return true;
+	case CBGetCapabilities:
+		call->type = &afs_SRXCBGetCapabilites;
+		return true;
 	default:
 		return false;
 	}
@@ -328,3 +343,86 @@ static int afs_deliver_cb_probe(struct afs_call *call, struct sk_buff *skb,
 	schedule_work(&call->work);
 	return 0;
 }
+
+/*
+ * allow the fileserver to ask about the cache manager's capabilities
+ */
+static void SRXAFSCB_GetCapabilities(struct work_struct *work)
+{
+	struct afs_interface *ifs;
+	struct afs_call *call = container_of(work, struct afs_call, work);
+	int loop, nifs;
+
+	struct {
+		struct /* InterfaceAddr */ {
+			__be32 nifs;
+			__be32 uuid[11];
+			__be32 ifaddr[32];
+			__be32 netmask[32];
+			__be32 mtu[32];
+		} ia;
+		struct /* Capabilities */ {
+			__be32 capcount;
+			__be32 caps[1];
+		} cap;
+	} reply;
+
+	_enter("");
+
+	nifs = 0;
+	ifs = kcalloc(32, sizeof(*ifs), GFP_KERNEL);
+	if (ifs) {
+		nifs = afs_get_ipv4_interfaces(ifs, 32, false);
+		if (nifs < 0) {
+			kfree(ifs);
+			ifs = NULL;
+			nifs = 0;
+		}
+	}
+
+	memset(&reply, 0, sizeof(reply));
+	reply.ia.nifs = htonl(nifs);
+
+	reply.ia.uuid[0] = htonl(afs_uuid.time_low);
+	reply.ia.uuid[1] = htonl(afs_uuid.time_mid);
+	reply.ia.uuid[2] = htonl(afs_uuid.time_hi_and_version);
+	reply.ia.uuid[3] = htonl((s8

[PATCH 12/16] AFS: Update the AFS fs documentation [try #3]

2007-04-25 Thread David Howells
Update the AFS fs documentation.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 Documentation/filesystems/afs.txt |  214 +++--
 1 files changed, 154 insertions(+), 60 deletions(-)

diff --git a/Documentation/filesystems/afs.txt b/Documentation/filesystems/afs.txt
index 2f4237d..12ad6c7 100644
--- a/Documentation/filesystems/afs.txt
+++ b/Documentation/filesystems/afs.txt
@@ -1,31 +1,82 @@
+====================
 kAFS: AFS FILESYSTEM
 ====================
 
-ABOUT
-=====
+Contents:
+
+ - Overview.
+ - Usage.
+ - Mountpoints.
+ - Proc filesystem.
+ - The cell database.
+ - Security.
+ - Examples.
+
+
+========
+OVERVIEW
+========
 
-This filesystem provides a fairly simple AFS filesystem driver. It is under
-development and only provides very basic facilities. It does not yet support
-the following AFS features:
+This filesystem provides a fairly simple secure AFS filesystem driver. It is
+under development and does not yet provide the full feature set.  The features
+it does support include:
 
-   (*) Write support.
-   (*) Communications security.
-   (*) Local caching.
-   (*) pioctl() system call.
-   (*) Automatic mounting of embedded mountpoints.
+ (*) Security (currently only AFS kaserver and KerberosIV tickets).
 
+ (*) File reading.
 
+ (*) Automounting.
+
+It does not yet support the following AFS features:
+
+ (*) Write support.
+
+ (*) Local caching.
+
+ (*) pioctl() system call.
+
+
+===========
+COMPILATION
+===========
+
+The filesystem should be enabled by turning on the kernel configuration
+options:
+
+	CONFIG_AF_RXRPC		- The RxRPC protocol transport
+	CONFIG_RXKAD		- The RxRPC Kerberos security handler
+	CONFIG_AFS		- The AFS filesystem
+
+Additionally, the following can be turned on to aid debugging:
+
+	CONFIG_AF_RXRPC_DEBUG	- Permit AF_RXRPC debugging to be enabled
+	CONFIG_AFS_DEBUG	- Permit AFS debugging to be enabled
+
+They permit the debugging messages to be turned on dynamically by manipulating
+the masks in the following files:
+
+	/sys/module/af_rxrpc/parameters/debug
+	/sys/module/afs/parameters/debug
+
+
+=====
 USAGE
 =====
 
 When inserting the driver modules the root cell must be specified along with a
 list of volume location server IP addresses:
 
-	insmod rxrpc.o
+	insmod af_rxrpc.o
+	insmod rxkad.o
 	insmod kafs.o rootcell=cambridge.redhat.com:172.16.18.73:172.16.18.91
 
-The first module is a driver for the RxRPC remote operation protocol, and the
-second is the actual filesystem driver for the AFS filesystem.
+The first module is the AF_RXRPC network protocol driver.  This provides the
+RxRPC remote operation protocol and may also be accessed from userspace.  See:
+
+   Documentation/networking/rxrpc.txt
+
+The second module is the kerberos RxRPC security driver, and the third module
+is the actual filesystem driver for the AFS filesystem.
 
 Once the module has been loaded, more modules can be added by the following
 procedure:
@@ -33,7 +84,7 @@ procedure:
 	echo add grand.central.org 18.7.14.88:128.2.191.224 >/proc/fs/afs/cells
 
 Where the parameters to the add command are the name of a cell and a list of
-volume location servers within that cell.
+volume location servers within that cell, with the latter separated by colons.
 
 Filesystems can be mounted anywhere by commands similar to the following:
 
@@ -42,11 +93,6 @@ Filesystems can be mounted anywhere by commands similar to the following:
 	mount -t afs "#root.afs." /afs
 	mount -t afs "#root.cell." /afs/cambridge
 
-  NB: When using this on Linux 2.4, the mount command has to be different,
-  since the filesystem doesn't have access to the device name argument:
-
-	mount -t afs none /afs -ovol="#root.afs."
-
 Where the initial character is either a hash or a percent symbol depending on
 whether you definitely want a R/W volume (hash) or whether you'd prefer a R/O
 volume, but are willing to use a R/W volume instead (percent).
@@ -60,55 +106,66 @@ named volume will be looked up in the cell specified during insmod.
 Additional cells can be added through /proc (see later section).
 
 
+===========
 MOUNTPOINTS
 ===========
 
-AFS has a concept of mountpoints. These are specially formatted symbolic links
-(of the same form as the device name passed to mount). kAFS presents these
-to the user as directories that have special properties:
+AFS has a concept of mountpoints. In AFS terms, these are specially formatted
+symbolic links (of the same form as the device name passed to mount).  kAFS
+presents these to the user as directories that have a follow-link capability
+(ie: symbolic link semantics).  If anyone attempts to access them, they will
+automatically cause the target volume to be mounted (if possible) on that site.
 
-  (*) They cannot

Re: [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3]

2007-04-25 Thread David Howells
Andrew Morton [EMAIL PROTECTED] wrote:

> I'm ducking all feature and cleanup patches now, and probably shall
> continue to do so for some weeks.  The priority (which I believe to be
> increasingly urgent) is to fix the 2.6.21 regressions and to stabilise
> the things which we presently have queued for 2.6.22.  Not to
> mention the 1000ish unaddressed bug reports in bugzilla and elsewhere.

Fair enough.  I think the idea is for them (or at least some of them) to go
through one of DaveM's net git trees anyway.

David


Re: Getting the new RxRPC patches upstream

2007-04-25 Thread David Howells

David Miller [EMAIL PROTECTED] wrote:

> Is it possible for your changes to be purely networking
> and not need those changes outside of the networking?

See my latest patchset release.  I've reduced the dependencies on
non-networking changes to:

 (1) Oleg Nesterov's patch to change cancel_delayed_work() to use del_timer()
 rather than del_timer_sync() [patch 02/16].

 This patch can be discarded without compilation failure at the expense of
 making AFS slightly less efficient. It also makes AF_RXRPC slightly less
 efficient, but only in the rmmod path.

 (2) A symbol export in the keyring stuff plus a proliferation of the types
 available in the struct key::type_data union [patch 03/16].  This does
 not conflict with any other patches that I know about.

 (3) A symbol export in the timer stuff [patch 04/16].

Everything else that remains after the reduction is confined to the AF_RXRPC
or AFS code, save for a couple of networking patches in my patchset that you
already have and I just need to make the thing compile.

I'm not sure that I can make the AF_RXRPC patches totally independent of the
AFS patches as the two sets need to interleave since the last AF_RXRPC patch
deletes the old RxRPC code - which the old AFS code depends on.

David


[PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #4]

2007-04-25 Thread David Howells

The first of these patches together provide secure client-side RxRPC
connectivity as a Linux kernel socket family.  Only the RxRPC transport/session
side is supplied - the presentation side (marshalling the data) is left to the
client.  Copies of the patches can be found here:

http://people.redhat.com/~dhowells/rxrpc/series
http://people.redhat.com/~dhowells/rxrpc/01-move-skb-generic.diff
http://people.redhat.com/~dhowells/rxrpc/02-cancel_delayed_work.diff
http://people.redhat.com/~dhowells/rxrpc/03-keys.diff
http://people.redhat.com/~dhowells/rxrpc/04-timer-exports.diff
http://people.redhat.com/~dhowells/rxrpc/05-af_rxrpc.diff

Further patches make the in-kernel AFS filesystem use AF_RXRPC and delete the
old RxRPC implementation:

http://people.redhat.com/~dhowells/rxrpc/06-afs-cleanup.diff
http://people.redhat.com/~dhowells/rxrpc/07-af_rxrpc-kernel.diff
http://people.redhat.com/~dhowells/rxrpc/08-af_rxrpc-afs.diff
http://people.redhat.com/~dhowells/rxrpc/09-af_rxrpc-delete-old.diff

And then the rest of the patches extend AFS to provide automatic unmounting of
automount trees, security support and directory-level write support (create,
mkdir, etc.):

http://people.redhat.com/~dhowells/rxrpc/10-afs-multimount.diff
http://people.redhat.com/~dhowells/rxrpc/11-afs-security.diff
http://people.redhat.com/~dhowells/rxrpc/12-afs-doc.diff

http://people.redhat.com/~dhowells/rxrpc/13-netlink-support-MSG_TRUNC.diff
http://people.redhat.com/~dhowells/rxrpc/14-afs-get-capabilities.diff
http://people.redhat.com/~dhowells/rxrpc/15-afs-initcallbackstate3.diff
http://people.redhat.com/~dhowells/rxrpc/16-afs-dir-write-support.diff

Note that file-level write support is not yet complete and so is not included
in this patch set.


The userspace access methods make use of the control data passed to/by
sendmsg() and recvmsg().  See the three simple test programs:

http://people.redhat.com/~dhowells/rxrpc/klog.c
http://people.redhat.com/~dhowells/rxrpc/rxrpc.c
http://people.redhat.com/~dhowells/rxrpc/listen.c
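
As a rough illustration of what "control data passed to sendmsg()" means in
practice, here is a minimal sketch that only prepares a msghdr carrying one
ancillary item.  It is not taken from the test programs above.  The helper
name rxrpc_prepare_msg() is made up, and the fallback values for SOL_RXRPC and
RXRPC_USER_CALL_ID, as well as the use of an unsigned long as the call ID
payload, are assumptions that should be checked against the rxrpc header and
documentation in the tree being used.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Assumed fallback values; prefer the definitions from your own headers. */
#ifndef SOL_RXRPC
#define SOL_RXRPC 272
#endif
#ifndef RXRPC_USER_CALL_ID
#define RXRPC_USER_CALL_ID 1
#endif

/*
 * Hypothetical helper: fill in a msghdr so that sendmsg() would carry `data`
 * as the payload and a user call ID as ancillary (control) data.  `control`
 * must be suitably aligned for a struct cmsghdr and at least
 * CMSG_SPACE(sizeof(unsigned long)) bytes long.
 * Returns 0 on success, -1 if the control buffer is too small.
 */
int rxrpc_prepare_msg(struct msghdr *msg, struct iovec *iov,
		      void *data, size_t data_len,
		      void *control, size_t control_len,
		      unsigned long user_call_id)
{
	struct cmsghdr *cmsg;

	if (control_len < CMSG_SPACE(sizeof(user_call_id)))
		return -1;

	memset(msg, 0, sizeof(*msg));
	memset(control, 0, control_len);

	/* the payload goes in the iovec */
	iov->iov_base = data;
	iov->iov_len = data_len;
	msg->msg_iov = iov;
	msg->msg_iovlen = 1;

	/* the call ID goes in a single control message */
	msg->msg_control = control;
	msg->msg_controllen = CMSG_SPACE(sizeof(user_call_id));

	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = SOL_RXRPC;
	cmsg->cmsg_type = RXRPC_USER_CALL_ID;
	cmsg->cmsg_len = CMSG_LEN(sizeof(user_call_id));
	memcpy(CMSG_DATA(cmsg), &user_call_id, sizeof(user_call_id));
	return 0;
}
```

The prepared msghdr would then be handed to sendmsg() on an AF_RXRPC socket;
msg_name can additionally be pointed at a destination address as described in
the change list further down.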

The klog program is provided to go and get a Kerberos IV key from the AFS
kaserver.  Currently it must be edited before compiling to note the right
server IP address and the appropriate credentials.

These programs can be compiled by:

make klog rxrpc listen CFLAGS="-Wall -g" LDLIBS="-lcrypto -lcrypt -lkrb4 -lkeyutils"

Then a ticket can be obtained by:

./klog

If a security key is acquired in this way, then all subsequent AFS operations -
including VL lookups and mounts - performed with that session keyring will be
authenticated using that key.  The key can be viewed like so:

[EMAIL PROTECTED] ~]# keyctl show
Session Keyring
   -3 --alswrv  0 0  keyring: _ses.3268
2 --alswrv  0 0   \_ keyring: _uid.0
111416553 --als--v  0 0   \_ rxrpc: [EMAIL PROTECTED]

TODO:

 (*) Make certain parameters (such as connection timeouts) userspace
 configurable.

 (*) Make userspace utilities use it; librxrpc.

 (*) Userspace documentation.

 (*) KerberosV security.

Changes:

 (*) SOCK_RPC has been removed.  SOCK_DGRAM is now used instead.

 (*) I've added a facility whereby calls can be made to destinations other than
 the connect() address of a client socket by making use of msg_name in the
 msghdr struct when using sendmsg() to send the first data packet of a
 call.  Indeed, a client socket need not be connected before being used
 so.

 (*) I've also added a facility whereby client calls may also be made on
 server sockets, again by using msg_name in the msghdr struct.  In such a
 case, the server's local transport endpoint is used.

 (*) I've made the write buffer space check available to various callers
 (sk_write_space) and implemented poll support.

 (*) Rewrote rxrpc_recvmsg().  It now concatenates adjacent data messages from
 the same call when delivering them.

 (*) Updated the documentation to include notes on recvmsg, cover control
 messages and cover SOL_RXRPC-level socket options.

 (*) Provided an in-kernel interface to give in-kernel utilities easier access
 to the facility.

 (*) Made fs/afs/ use it.

 (*) Deleted the old contents of net/rxrpc/.

 (*) Use the scatterlist interface to the crypto API for now.  The patch that
 added the direct access interface conflicts with patches Herbert Xu is
 producing, so I've dropped it for the moment.

 (*) Moved a bug fix to make secure connection reuse work from the
 af_rxrpc-kernel patch to the af_rxrpc main patch.

 (*) Make RxRPC use its own private work queues rather than keventd's to avoid
 deadlocks when AFS tries to use keventd too.  This also puts encryption
 in the private work queue rather than keventd's queue as that might take
 a relatively long time to 

[PATCH 02/16] cancel_delayed_work: use del_timer() instead of del_timer_sync() [try #4]

2007-04-25 Thread David Howells
del_timer_sync() buys nothing for cancel_delayed_work(), but it is less
efficient since it locks the timer unconditionally, and may wait for the
completion of the delayed_work_timer_fn().

cancel_delayed_work() == 0 means:

before this patch:
work->func may still be running or queued

after this patch:
work->func may still be running or queued, or
delayed_work_timer_fn->__queue_work() in progress.

The latter doesn't differ from the caller's POV,
delayed_work_timer_fn() is called with _PENDING
bit set.

cancel_delayed_work() == 1 with this patch adds a new possibility:

delayed_work->work was cancelled, but delayed_work_timer_fn
is still running (this is only possible for the re-arming
works on single-threaded workqueue).

In this case the timer was re-started by work->func(), nobody
else can do this. This in turn means that delayed_work_timer_fn
has already passed __queue_work() (and won't touch delayed_work)
because nobody else can queue delayed_work->work.

Signed-off-by: Oleg Nesterov [EMAIL PROTECTED]
Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 include/linux/workqueue.h |7 ---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 2a7b38d..b8abfc7 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -191,14 +191,15 @@ int execute_in_process_context(work_func_t fn, struct execute_work *);
 
 /*
  * Kill off a pending schedule_delayed_work().  Note that the work callback
- * function may still be running on return from cancel_delayed_work().  Run
- * flush_scheduled_work() to wait on it.
+ * function may still be running on return from cancel_delayed_work(), unless
+ * it returns 1 and the work doesn't re-arm itself. Run flush_workqueue() or
+ * cancel_work_sync() to wait on it.
  */
 static inline int cancel_delayed_work(struct delayed_work *work)
 {
 	int ret;
 
-	ret = del_timer_sync(&work->timer);
+	ret = del_timer(&work->timer);
 	if (ret)
 		work_release(&work->work);
 	return ret;



[PATCH 01/16] AF_RXRPC: Move generic skbuff stuff from XFRM code to generic code [try #4]

2007-04-25 Thread David Howells
Move generic skbuff stuff from XFRM code to generic code so that AF_RXRPC can
use it too.

The kdoc comments I've attached to the functions need to be checked by whoever
wrote them as I had to make some guesses about the workings of these functions.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 include/linux/skbuff.h |6 ++
 include/net/esp.h  |2 -
 net/core/skbuff.c  |  188 
 net/xfrm/xfrm_algo.c   |  169 ---
 4 files changed, 194 insertions(+), 171 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 5992f65..c905d42 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -83,6 +83,7 @@
  */
 
 struct net_device;
+struct scatterlist;
 
 #ifdef CONFIG_NETFILTER
 struct nf_conntrack {
@@ -361,6 +362,11 @@ extern struct sk_buff *skb_realloc_headroom(struct sk_buff *skb,
 extern struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
 				       int newheadroom, int newtailroom,
 				       gfp_t priority);
+extern int	       skb_to_sgvec(struct sk_buff *skb,
+				    struct scatterlist *sg, int offset,
+				    int len);
+extern int	       skb_cow_data(struct sk_buff *skb, int tailbits,
+				    struct sk_buff **trailer);
 extern int	       skb_pad(struct sk_buff *skb, int pad);
 #define dev_kfree_skb(a)	kfree_skb(a)
 extern void	      skb_over_panic(struct sk_buff *skb, int len,
diff --git a/include/net/esp.h b/include/net/esp.h
index 713d039..d05d8d2 100644
--- a/include/net/esp.h
+++ b/include/net/esp.h
@@ -40,8 +40,6 @@ struct esp_data
} auth;
 };
 
-extern int skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len);
-extern int skb_cow_data(struct sk_buff *skb, int tailbits, struct sk_buff **trailer);
 extern void *pskb_put(struct sk_buff *skb, struct sk_buff *tail, int len);
 
 static inline int esp_mac_digest(struct esp_data *esp, struct sk_buff *skb,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 336958f..aa02bd4 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -55,6 +55,7 @@
 #include <linux/cache.h>
 #include <linux/rtnetlink.h>
 #include <linux/init.h>
+#include <linux/scatterlist.h>
 
 #include <net/protocol.h>
 #include <net/dst.h>
@@ -2005,6 +2006,190 @@ void __init skb_init(void)
NULL, NULL);
 }
 
+/**
+ * skb_to_sgvec - Fill a scatter-gather list from a socket buffer
+ * @skb: Socket buffer containing the buffers to be mapped
+ * @sg: The scatter-gather list to map into
+ * @offset: The offset into the buffer's contents to start mapping
+ * @len: Length of buffer space to be mapped
+ *
+ * Fill the specified scatter-gather list with mappings/pointers into a
+ * region of the buffer space attached to a socket buffer.
+ */
+int
+skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
+{
+	int start = skb_headlen(skb);
+	int i, copy = start - offset;
+	int elt = 0;
+
+	if (copy > 0) {
+		if (copy > len)
+			copy = len;
+		sg[elt].page = virt_to_page(skb->data + offset);
+		sg[elt].offset = (unsigned long)(skb->data + offset) % PAGE_SIZE;
+		sg[elt].length = copy;
+		elt++;
+		if ((len -= copy) == 0)
+			return elt;
+		offset += copy;
+	}
+
+	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+		int end;
+
+		BUG_TRAP(start <= offset + len);
+
+		end = start + skb_shinfo(skb)->frags[i].size;
+		if ((copy = end - offset) > 0) {
+			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+
+			if (copy > len)
+				copy = len;
+			sg[elt].page = frag->page;
+			sg[elt].offset = frag->page_offset+offset-start;
+			sg[elt].length = copy;
+			elt++;
+			if (!(len -= copy))
+				return elt;
+			offset += copy;
+		}
+		start = end;
+	}
+
+	if (skb_shinfo(skb)->frag_list) {
+		struct sk_buff *list = skb_shinfo(skb)->frag_list;
+
+		for (; list; list = list->next) {
+			int end;
+
+			BUG_TRAP(start <= offset + len);
+
+			end = start + list->len;
+			if ((copy = end - offset) > 0) {
+				if (copy > len)
+					copy = len;
+				elt += skb_to_sgvec(list, sg+elt, offset - start, copy);
+				if ((len -= copy) == 0
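
The offset/length walk that skb_to_sgvec() performs over the linear header and
the page fragments can be tried out in userspace with a simplified model.  The
sketch below is not the kernel code: struct seg and struct sg_ent are made-up
stand-ins for the skb buffer pieces and struct scatterlist, and frag_list
recursion is left out.  It only shows how a byte range is split across
consecutive segments.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct seg {			/* one contiguous piece of buffer space */
	const char *base;
	int len;
};

struct sg_ent {			/* one scatter-gather entry */
	const char *addr;
	int length;
};

/*
 * Map `len` bytes starting `offset` bytes into the concatenation of
 * segs[0..nsegs-1] onto sg[].  Returns the number of entries filled in.
 * The caller must make sure that sg[] is large enough and that the range
 * lies within the segments.
 */
int segs_to_sgvec(const struct seg *segs, int nsegs,
		  struct sg_ent *sg, int offset, int len)
{
	int start = 0;		/* offset at which the current segment begins */
	int elt = 0;
	int i;

	for (i = 0; i < nsegs && len > 0; i++) {
		int end = start + segs[i].len;
		int copy = end - offset;	/* bytes usable in this segment */

		if (copy > 0) {
			if (copy > len)
				copy = len;
			sg[elt].addr = segs[i].base + (offset - start);
			sg[elt].length = copy;
			elt++;
			len -= copy;
			offset += copy;
		}
		start = end;
	}
	return elt;
}
```

For three segments of 4, 6 and 2 bytes, a request for 7 bytes at offset 2
yields two entries: the last 2 bytes of the first segment and the first 5
bytes of the second.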

[PATCH 03/16] AF_RXRPC: Key facility changes for AF_RXRPC [try #4]

2007-04-25 Thread David Howells
Export the keyring key type definition and document its availability.

Add alternative types into the key's type_data union to make it more useful.
Not all users necessarily want to use it as a list_head (AF_RXRPC doesn't, for
example), so make it clear that it can be used in other ways.
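
To show the intent, here is a minimal userspace sketch of the union with the
extra members and of a key type using the pointer slots rather than the list
head.  struct list_head is re-declared locally as a stand-in, and struct
conn_state plus the two helper functions are hypothetical names that do not
appear in the patch.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* The union with the alternatives added by this patch. */
union key_type_data {
	struct list_head	link;
	unsigned long		x[2];
	void			*p[2];
};

/* Hypothetical per-key state that a key type might want to hang off a key. */
struct conn_state {
	int security_index;
};

/* Stash a private pointer in slot 0 instead of using the list head. */
void type_data_set_conn(union key_type_data *td, struct conn_state *conn)
{
	td->p[0] = conn;
	td->p[1] = NULL;
}

/* Fetch the private pointer back out of slot 0. */
struct conn_state *type_data_get_conn(const union key_type_data *td)
{
	return td->p[0];
}
```

On the usual Linux ABIs, where an unsigned long is as wide as a pointer, all
three members occupy the same two words, so the union does not grow.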

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 Documentation/keys.txt  |   12 
 include/linux/key.h |2 ++
 security/keys/keyring.c |2 ++
 3 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/Documentation/keys.txt b/Documentation/keys.txt
index 60c665d..81d9aa0 100644
--- a/Documentation/keys.txt
+++ b/Documentation/keys.txt
@@ -859,6 +859,18 @@ payload contents for more information.
void unregister_key_type(struct key_type *type);
 
 
+Under some circumstances, it may be desirable to deal with a
+bundle of keys.  The facility provides access to the keyring type for managing
+such a bundle:
+
+   struct key_type key_type_keyring;
+
+This can be used with a function such as request_key() to find a specific
+keyring in a process's keyrings.  A keyring thus found can then be searched
+with keyring_search().  Note that it is not possible to use request_key() to
+search a specific keyring, so using keyrings in this way is of limited utility.
+
+
 ===
 NOTES ON ACCESSING PAYLOAD CONTENTS
 ===
diff --git a/include/linux/key.h b/include/linux/key.h
index 169f05e..a9220e7 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -160,6 +160,8 @@ struct key {
 */
union {
 		struct list_head	link;
+		unsigned long		x[2];
+		void			*p[2];
} type_data;
 
/* key data
diff --git a/security/keys/keyring.c b/security/keys/keyring.c
index ad45ce7..88292e3 100644
--- a/security/keys/keyring.c
+++ b/security/keys/keyring.c
@@ -66,6 +66,8 @@ struct key_type key_type_keyring = {
.read   = keyring_read,
 };
 
+EXPORT_SYMBOL(key_type_keyring);
+
 /*
  * semaphore to serialise link/link calls to prevent two link calls in parallel
  * introducing a cycle



[PATCH 10/16] AFS: Handle multiple mounts of an AFS superblock correctly [try #4]

2007-04-25 Thread David Howells
Handle multiple mounts of an AFS superblock correctly, checking to see whether
the superblock is already initialised after calling sget() rather than just
unconditionally stamping all over it.
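
The shape of the fix can be modelled in userspace.  This is only a sketch
under assumptions: the toy_* names, the fixed-size table and the fill counter
are invented for illustration and have nothing to do with the real VFS
structures.  The point it shows is that a lookup which may return an existing
object must be followed by a check before that object is initialised.

```c
#include <stddef.h>

#define MAX_SB 8

/* Toy superblock; a stand-in, not the kernel's struct super_block. */
struct toy_sb {
	int in_use;
	int volume_id;	/* what the test callback would compare */
	void *root;	/* NULL until the superblock has been filled in */
	int active;
	int fill_count;	/* how many times the fill routine ran */
};

static struct toy_sb sb_table[MAX_SB];
static int dummy_root;

/* Find a superblock for this volume or claim a free slot (cf. sget()). */
static struct toy_sb *toy_sget(int volume_id)
{
	struct toy_sb *free_slot = NULL;
	int i;

	for (i = 0; i < MAX_SB; i++) {
		if (sb_table[i].in_use && sb_table[i].volume_id == volume_id)
			return &sb_table[i];
		if (!sb_table[i].in_use && !free_slot)
			free_slot = &sb_table[i];
	}
	if (free_slot) {
		free_slot->in_use = 1;
		free_slot->volume_id = volume_id;
	}
	return free_slot;
}

/* Stand-in for the fill routine: set up the root and count the call. */
static int toy_fill_super(struct toy_sb *sb)
{
	sb->root = &dummy_root;
	sb->fill_count++;
	return 0;
}

/* Mount: only initialise the superblock if nobody has done so already. */
struct toy_sb *toy_get_sb(int volume_id)
{
	struct toy_sb *sb = toy_sget(volume_id);

	if (!sb)
		return NULL;
	if (!sb->root) {
		/* initial creation */
		if (toy_fill_super(sb) < 0) {
			sb->in_use = 0;
			return NULL;
		}
		sb->active = 1;
	}
	/* otherwise reuse the existing, already active superblock */
	return sb;
}
```

Mounting the same volume twice returns the same object and runs the fill
routine once, which is the behaviour the patch is after.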

Also delete the silent parameter to afs_fill_super() as it's not used and
can, in any case, be obtained from sb->s_flags.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/afs/super.c |   26 --
 1 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/fs/afs/super.c b/fs/afs/super.c
index efc4fe6..77e6875 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -212,7 +212,7 @@ static int afs_test_super(struct super_block *sb, void *data)
 /*
  * fill in the superblock
  */
-static int afs_fill_super(struct super_block *sb, void *data, int silent)
+static int afs_fill_super(struct super_block *sb, void *data)
 {
struct afs_mount_params *params = data;
struct afs_super_info *as = NULL;
@@ -319,17 +319,23 @@ static int afs_get_sb(struct file_system_type *fs_type,
goto error;
}
 
-	sb->s_flags = flags;
-
-	ret = afs_fill_super(sb, &params, flags & MS_SILENT ? 1 : 0);
-	if (ret < 0) {
-		up_write(&sb->s_umount);
-		deactivate_super(sb);
-		goto error;
+	if (!sb->s_root) {
+		/* initial superblock/root creation */
+		_debug("create");
+		sb->s_flags = flags;
+		ret = afs_fill_super(sb, &params);
+		if (ret < 0) {
+			up_write(&sb->s_umount);
+			deactivate_super(sb);
+			goto error;
+		}
+		sb->s_flags |= MS_ACTIVE;
+	} else {
+		_debug("reuse");
+		ASSERTCMP(sb->s_flags, &, MS_ACTIVE);
 	}
-	sb->s_flags |= MS_ACTIVE;
-	simple_set_mnt(mnt, sb);
 
+	simple_set_mnt(mnt, sb);
 	afs_put_volume(params.volume);
 	afs_put_cell(params.default_cell);
 	_leave(" = 0 [%p]", sb);



[PATCH 15/16] AFS: Implement the CB.InitCallBackState3 operation [try #4]

2007-04-25 Thread David Howells
Implement the CB.InitCallBackState3 operation for the fileserver to call.
This reduces the amount of network traffic because if this op is aborted, the
fileserver will then attempt a CB.InitCallBackState operation.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/afs/afs_cm.h|1 +
 fs/afs/cmservice.c |   46 ++
 2 files changed, 47 insertions(+), 0 deletions(-)

diff --git a/fs/afs/afs_cm.h b/fs/afs/afs_cm.h
index d4bd201..7b4d4fa 100644
--- a/fs/afs/afs_cm.h
+++ b/fs/afs/afs_cm.h
@@ -23,6 +23,7 @@ enum AFS_CM_Operations {
CBGetCE = 208,  /* get cache file description */
CBGetXStatsVersion  = 209,  /* get version of extended statistics */
 	CBGetXStats		= 210,	/* get contents of extended statistics data */
+	CBInitCallBackState3	= 213,	/* initialise callback state, version 3 */
CBGetCapabilities   = 65538, /* get client capabilities */
 };
 
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index f8ad36b..32deb04 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -20,6 +20,8 @@ struct workqueue_struct *afs_cm_workqueue;
 
 static int afs_deliver_cb_init_call_back_state(struct afs_call *,
   struct sk_buff *, bool);
+static int afs_deliver_cb_init_call_back_state3(struct afs_call *,
+   struct sk_buff *, bool);
 static int afs_deliver_cb_probe(struct afs_call *, struct sk_buff *, bool);
 static int afs_deliver_cb_callback(struct afs_call *, struct sk_buff *, bool);
 static int afs_deliver_cb_get_capabilities(struct afs_call *, struct sk_buff *,
@@ -47,6 +49,16 @@ static const struct afs_call_type afs_SRXCBInitCallBackState = {
 };
 
 /*
+ * CB.InitCallBackState3 operation type
+ */
+static const struct afs_call_type afs_SRXCBInitCallBackState3 = {
+	.name		= "CB.InitCallBackState3",
+   .deliver= afs_deliver_cb_init_call_back_state3,
+   .abort_to_error = afs_abort_to_error,
+   .destructor = afs_cm_destructor,
+};
+
+/*
  * CB.Probe operation type
  */
 static const struct afs_call_type afs_SRXCBProbe = {
@@ -83,6 +95,9 @@ bool afs_cm_incoming_call(struct afs_call *call)
 	case CBInitCallBackState:
 		call->type = &afs_SRXCBInitCallBackState;
 		return true;
+	case CBInitCallBackState3:
+		call->type = &afs_SRXCBInitCallBackState3;
+		return true;
 	case CBProbe:
 		call->type = &afs_SRXCBProbe;
 		return true;
@@ -312,6 +327,37 @@ static int afs_deliver_cb_init_call_back_state(struct afs_call *call,
 }
 
 /*
+ * deliver request data to a CB.InitCallBackState3 call
+ */
+static int afs_deliver_cb_init_call_back_state3(struct afs_call *call,
+   struct sk_buff *skb,
+   bool last)
+{
+   struct afs_server *server;
+   struct in_addr addr;
+
+	_enter(",{%u},%d", skb->len, last);
+
+	if (!last)
+		return 0;
+
+	/* no unmarshalling required */
+	call->state = AFS_CALL_REPLYING;
+
+	/* we'll need the file server record as that tells us which set of
+	 * vnodes to operate upon */
+	memcpy(&addr, &skb->nh.iph->saddr, 4);
+	server = afs_find_server(&addr);
+	if (!server)
+		return -ENOTCONN;
+	call->server = server;
+
+	INIT_WORK(&call->work, SRXAFSCB_InitCallBackState);
+	schedule_work(&call->work);
+	return 0;
+}
+
+/*
  * allow the fileserver to see if the cache manager is still alive
  */
 static void SRXAFSCB_Probe(struct work_struct *work)



[PATCH 13/16] commit ad495d7b6cfcd1bc2eaf06c42699be0bb5d84234 [try #4]

2007-04-25 Thread David Howells
[NETLINK]: Mirror UDP MSG_TRUNC semantics.

If the user passes MSG_TRUNC in via msg_flags, return
the full packet size not the truncated size.
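
For comparison, the UDP behaviour being mirrored can be observed from
userspace.  The following is a minimal sketch, assuming Linux and a usable
loopback interface; udp_trunc_probe() is a made-up helper name.  It sends a
datagram to itself and receives it into a deliberately short buffer with
MSG_TRUNC set in the flags.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send a datagram of `dgram_len` bytes to ourselves over loopback UDP and
 * receive it into a buffer of only `buf_len` bytes with MSG_TRUNC set.
 * Returns what recv() returned, or -1 on any error.
 */
ssize_t udp_trunc_probe(size_t dgram_len, size_t buf_len)
{
	char out[512], in[512];
	struct sockaddr_in sin;
	socklen_t slen = sizeof(sin);
	ssize_t n = -1;
	int fd;

	if (dgram_len > sizeof(out) || buf_len > sizeof(in))
		return -1;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = 0;		/* let the kernel pick a port */

	memset(out, 'x', sizeof(out));

	/* bind, learn the chosen port, send to ourselves, then receive */
	if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) == 0 &&
	    getsockname(fd, (struct sockaddr *)&sin, &slen) == 0 &&
	    sendto(fd, out, dgram_len, 0,
		   (struct sockaddr *)&sin, sizeof(sin)) == (ssize_t)dgram_len)
		n = recv(fd, in, buf_len, MSG_TRUNC);

	close(fd);
	return n;
}
```

With MSG_TRUNC passed in, recv() on a UDP socket reports the real datagram
length even though only buf_len bytes were copied, so a caller can tell how
large a buffer it should have supplied.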

Idea from Herbert Xu and Thomas Graf.

Signed-off-by: David S. Miller [EMAIL PROTECTED]
---

 net/netlink/af_netlink.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index c48b0f4..5890210 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1242,6 +1242,9 @@ static int netlink_recvmsg(struct kiocb *kiocb, struct socket *sock,
 
 	scm_recv(sock, msg, siocb->scm, flags);
 
+	if (flags & MSG_TRUNC)
+		copied = skb->len;
+
 out:
 	netlink_rcv_wake(sk);
 	return err ? : copied;



[PATCH 14/16] AFS: Add support for the CB.GetCapabilities operation [try #4]

2007-04-25 Thread David Howells
Add support for the CB.GetCapabilities operation with which the fileserver can
ask the client for the following information:

 (1) The list of network interfaces it has available as IPv4 address + netmask
 plus the MTUs.

 (2) The client's UUID.

 (3) The extended capabilities of the client, for which the only current one
 is unified error mapping (abort code interpretation).

To support this, the patch adds the following routines to AFS:

 (1) A function to iterate through all the network interfaces using RTNETLINK
 to extract IPv4 addresses and MTUs.

 (2) A function to iterate through all the network interfaces using RTNETLINK
 to pull out the MAC address of the lowest index interface to use in UUID
 construction.
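
The reply that carries this information is a flat array of 32-bit words in
network byte order.  As a minimal userspace sketch of the marshalling, with
the layout modelled on the reply structure in the patch: struct iface,
struct caps_reply and build_caps_reply() are hypothetical names, the interface
values are assumed to be held in host byte order, and the UUID words are left
zeroed rather than guessed at.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

#define MAX_IFS 32	/* matches the array bounds used in the patch */

/* Hypothetical interface record; the patch's struct afs_interface may differ. */
struct iface {
	uint32_t addr;		/* IPv4 address, host byte order */
	uint32_t netmask;	/* host byte order */
	uint32_t mtu;
};

/* Wire layout modelled on the reply structure in the patch. */
struct caps_reply {
	struct {
		uint32_t nifs;
		uint32_t uuid[11];
		uint32_t ifaddr[MAX_IFS];
		uint32_t netmask[MAX_IFS];
		uint32_t mtu[MAX_IFS];
	} ia;
	struct {
		uint32_t capcount;
		uint32_t caps[1];
	} cap;
};

#define CAP_ERROR_TRANSLATION 0x1

/* Marshal up to MAX_IFS interfaces into network byte order. */
void build_caps_reply(struct caps_reply *reply,
		      const struct iface *ifs, int nifs)
{
	int loop;

	if (nifs < 0)
		nifs = 0;
	if (nifs > MAX_IFS)
		nifs = MAX_IFS;

	memset(reply, 0, sizeof(*reply));
	reply->ia.nifs = htonl(nifs);
	for (loop = 0; loop < nifs; loop++) {
		reply->ia.ifaddr[loop] = htonl(ifs[loop].addr);
		reply->ia.netmask[loop] = htonl(ifs[loop].netmask);
		reply->ia.mtu[loop] = htonl(ifs[loop].mtu);
	}

	/* a single capability word: unified error mapping */
	reply->cap.capcount = htonl(1);
	reply->cap.caps[0] = htonl(CAP_ERROR_TRANSLATION);
}
```

Unused interface slots stay zero because the whole reply is cleared first,
which keeps the structure a fixed size regardless of how many interfaces were
found.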

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/afs/Makefile|1 
 fs/afs/afs_cm.h|3 
 fs/afs/cmservice.c |   98 ++
 fs/afs/internal.h  |   42 
 fs/afs/main.c  |   49 +
 fs/afs/rxrpc.c |   39 
 fs/afs/use-rtnetlink.c |  473 
 7 files changed, 705 insertions(+), 0 deletions(-)

diff --git a/fs/afs/Makefile b/fs/afs/Makefile
index cca198b..01545eb 100644
--- a/fs/afs/Makefile
+++ b/fs/afs/Makefile
@@ -18,6 +18,7 @@ kafs-objs := \
security.o \
server.o \
super.o \
+   use-rtnetlink.o \
vlclient.o \
vlocation.o \
vnode.o \
diff --git a/fs/afs/afs_cm.h b/fs/afs/afs_cm.h
index 7c8e3d4..d4bd201 100644
--- a/fs/afs/afs_cm.h
+++ b/fs/afs/afs_cm.h
@@ -23,6 +23,9 @@ enum AFS_CM_Operations {
CBGetCE = 208,  /* get cache file description */
CBGetXStatsVersion  = 209,  /* get version of extended statistics */
 	CBGetXStats		= 210,	/* get contents of extended statistics data */
+   CBGetCapabilities   = 65538, /* get client capabilities */
 };
 
+#define AFS_CAP_ERROR_TRANSLATION  0x1
+
 #endif /* AFS_FS_H */
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index 7e184bb..f8ad36b 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -22,6 +22,8 @@ static int afs_deliver_cb_init_call_back_state(struct afs_call *,
   struct sk_buff *, bool);
 static int afs_deliver_cb_probe(struct afs_call *, struct sk_buff *, bool);
 static int afs_deliver_cb_callback(struct afs_call *, struct sk_buff *, bool);
+static int afs_deliver_cb_get_capabilities(struct afs_call *, struct sk_buff *,
+  bool);
 static void afs_cm_destructor(struct afs_call *);
 
 /*
@@ -55,6 +57,16 @@ static const struct afs_call_type afs_SRXCBProbe = {
 };
 
 /*
+ * CB.GetCapabilities operation type
+ */
+static const struct afs_call_type afs_SRXCBGetCapabilites = {
+	.name		= "CB.GetCapabilities",
+   .deliver= afs_deliver_cb_get_capabilities,
+   .abort_to_error = afs_abort_to_error,
+   .destructor = afs_cm_destructor,
+};
+
+/*
  * route an incoming cache manager call
  * - return T if supported, F if not
  */
@@ -74,6 +86,9 @@ bool afs_cm_incoming_call(struct afs_call *call)
 	case CBProbe:
 		call->type = &afs_SRXCBProbe;
 		return true;
+	case CBGetCapabilities:
+		call->type = &afs_SRXCBGetCapabilites;
+		return true;
 	default:
 		return false;
 	}
@@ -328,3 +343,86 @@ static int afs_deliver_cb_probe(struct afs_call *call, struct sk_buff *skb,
 	schedule_work(&call->work);
 	return 0;
 }
+
+/*
+ * allow the fileserver to ask about the cache manager's capabilities
+ */
+static void SRXAFSCB_GetCapabilities(struct work_struct *work)
+{
+   struct afs_interface *ifs;
+   struct afs_call *call = container_of(work, struct afs_call, work);
+   int loop, nifs;
+
+   struct {
+   struct /* InterfaceAddr */ {
+   __be32 nifs;
+   __be32 uuid[11];
+   __be32 ifaddr[32];
+   __be32 netmask[32];
+   __be32 mtu[32];
+   } ia;
+   struct /* Capabilities */ {
+   __be32 capcount;
+   __be32 caps[1];
+   } cap;
+   } reply;
+
+	_enter("");
+
+	nifs = 0;
+	ifs = kcalloc(32, sizeof(*ifs), GFP_KERNEL);
+	if (ifs) {
+		nifs = afs_get_ipv4_interfaces(ifs, 32, false);
+		if (nifs < 0) {
+			kfree(ifs);
+			ifs = NULL;
+			nifs = 0;
+		}
+	}
+
+	memset(&reply, 0, sizeof(reply));
+   reply.ia.nifs = htonl(nifs);
+
+   reply.ia.uuid[0] = htonl(afs_uuid.time_low);
+   reply.ia.uuid[1] = htonl(afs_uuid.time_mid);
+   reply.ia.uuid[2] = htonl(afs_uuid.time_hi_and_version);
+   reply.ia.uuid[3] = htonl((s8

[PATCH 12/16] AFS: Update the AFS fs documentation [try #4]

2007-04-25 Thread David Howells
Update the AFS fs documentation.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 Documentation/filesystems/afs.txt |  214 +++--
 1 files changed, 154 insertions(+), 60 deletions(-)

diff --git a/Documentation/filesystems/afs.txt b/Documentation/filesystems/afs.txt
index 2f4237d..12ad6c7 100644
--- a/Documentation/filesystems/afs.txt
+++ b/Documentation/filesystems/afs.txt
@@ -1,31 +1,82 @@
+
 kAFS: AFS FILESYSTEM
 
 
-ABOUT
-=
+Contents:
+
+ - Overview.
+ - Usage.
+ - Mountpoints.
+ - Proc filesystem.
+ - The cell database.
+ - Security.
+ - Examples.
+
+
+
+OVERVIEW
+
 
-This filesystem provides a fairly simple AFS filesystem driver. It is under
-development and only provides very basic facilities. It does not yet support
-the following AFS features:
+This filesystem provides a fairly simple secure AFS filesystem driver. It is
+under development and does not yet provide the full feature set.  The features
+it does support include:
 
-   (*) Write support.
-   (*) Communications security.
-   (*) Local caching.
-   (*) pioctl() system call.
-   (*) Automatic mounting of embedded mountpoints.
+ (*) Security (currently only AFS kaserver and KerberosIV tickets).
 
+ (*) File reading.
 
+ (*) Automounting.
+
+It does not yet support the following AFS features:
+
+ (*) Write support.
+
+ (*) Local caching.
+
+ (*) pioctl() system call.
+
+
+===
+COMPILATION
+===
+
+The filesystem should be enabled by turning on the kernel configuration
+options:
+
+   CONFIG_AF_RXRPC - The RxRPC protocol transport
+   CONFIG_RXKAD- The RxRPC Kerberos security handler
+   CONFIG_AFS  - The AFS filesystem
+
+Additionally, the following can be turned on to aid debugging:
+
+   CONFIG_AF_RXRPC_DEBUG   - Permit AF_RXRPC debugging to be enabled
+   CONFIG_AFS_DEBUG- Permit AFS debugging to be enabled
+
+They permit the debugging messages to be turned on dynamically by manipulating
+the masks in the following files:
+
+   /sys/module/af_rxrpc/parameters/debug
+   /sys/module/afs/parameters/debug
+
+
+=
 USAGE
 =
 
 When inserting the driver modules the root cell must be specified along with a
 list of volume location server IP addresses:
 
-   insmod rxrpc.o
+   insmod af_rxrpc.o
+   insmod rxkad.o
insmod kafs.o rootcell=cambridge.redhat.com:172.16.18.73:172.16.18.91
 
-The first module is a driver for the RxRPC remote operation protocol, and the
-second is the actual filesystem driver for the AFS filesystem.
+The first module is the AF_RXRPC network protocol driver.  This provides the
+RxRPC remote operation protocol and may also be accessed from userspace.  See:
+
+   Documentation/networking/rxrpc.txt
+
+The second module is the kerberos RxRPC security driver, and the third module
+is the actual filesystem driver for the AFS filesystem.
 
 Once the module has been loaded, more modules can be added by the following
 procedure:
@@ -33,7 +84,7 @@ procedure:
 	echo add grand.central.org 18.7.14.88:128.2.191.224 >/proc/fs/afs/cells
 
 Where the parameters to the add command are the name of a cell and a list of
-volume location servers within that cell.
+volume location servers within that cell, with the latter separated by colons.
 
 Filesystems can be mounted anywhere by commands similar to the following:
 
@@ -42,11 +93,6 @@ Filesystems can be mounted anywhere by commands similar to the following:
 	mount -t afs "#root.afs." /afs
 	mount -t afs "#root.cell." /afs/cambridge
 
-  NB: When using this on Linux 2.4, the mount command has to be different,
-  since the filesystem doesn't have access to the device name argument:
-
-	mount -t afs none /afs -ovol="#root.afs."
-
 Where the initial character is either a hash or a percent symbol depending on
 whether you definitely want a R/W volume (hash) or whether you'd prefer a R/O
 volume, but are willing to use a R/W volume instead (percent).
@@ -60,55 +106,66 @@ named volume will be looked up in the cell specified during insmod.
 Additional cells can be added through /proc (see later section).
 
 
+===
 MOUNTPOINTS
 ===
 
-AFS has a concept of mountpoints. These are specially formatted symbolic links
-(of the same form as the device name passed to mount). kAFS presents these
-to the user as directories that have special properties:
+AFS has a concept of mountpoints. In AFS terms, these are specially formatted
+symbolic links (of the same form as the device name passed to mount).  kAFS
+presents these to the user as directories that have a follow-link capability
+(ie: symbolic link semantics).  If anyone attempts to access them, they will
+automatically cause the target volume to be mounted (if possible) on that site.
 
-  (*) They cannot

[PATCH 04/16] AF_RXRPC: Make it possible to merely try to cancel timers from a module [try #4]

2007-04-25 Thread David Howells
Export try_to_del_timer_sync() for use by the AF_RXRPC module.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 kernel/timer.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/kernel/timer.c b/kernel/timer.c
index dd6c2c1..b22bd39 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -505,6 +505,8 @@ out:
return ret;
 }
 
+EXPORT_SYMBOL(try_to_del_timer_sync);
+
 /**
  * del_timer_sync - deactivate a timer and wait for the handler to finish.
  * @timer: the timer to be deactivated



Re: [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3]

2007-04-25 Thread David Howells
David Miller [EMAIL PROTECTED] wrote:

> Then please generate your patches against my net-2.6.21 GIT
> tree.  Most of your initial patches in the series (the SKB
> routine one for example) are already in my tree.

Do you mean your net-2.6.22 GIT tree?

Do you want me to make it available as a GIT tree for you to pull?  Or would
you prefer patches?

David


Transforming code to using ICMP packet accessors

2007-04-26 Thread David Howells

How do I convert:

	addr = *(__be32 *)(skb->nh.raw + serr->addr_offset);

to use the ICMP accessor macros now that skb->nh is no longer available?  I
was using this to pluck an address out of the ICMP packet payload, but 

void rxrpc_UDP_error_report(struct sock *sk)
{
	struct sock_exterr_skb *serr;
	...
	struct sk_buff *skb;
	__be32 addr;
	__be16 port;
	...

	skb = skb_dequeue(&sk->sk_error_queue);
	if (!skb) {
		_leave("UDP socket errqueue empty");
		return;
	}
	...

	serr = SKB_EXT_ERR(skb);
	addr = *(__be32 *)(skb->nh.raw + serr->addr_offset);
	port = serr->port;

	_net("Rx UDP Error from "NIPQUAD_FMT":%hu",
	     NIPQUAD(addr), ntohs(port));
	...
}

Should I do this?:

	addr = *(__be32 *)(skb_network_header(skb) + serr->addr_offset);

David


Re: Transforming code to using ICMP packet accessors

2007-04-26 Thread David Howells
Arnaldo Carvalho de Melo [EMAIL PROTECTED] wrote:

> Yes, as this is the only use in this function.

Thanks!

David


[PATCH 01/14] cancel_delayed_work: use del_timer() instead of del_timer_sync() [net-2.6]

2007-04-26 Thread David Howells
del_timer_sync() buys nothing for cancel_delayed_work(), but it is less
efficient since it locks the timer unconditionally, and may wait for the
completion of the delayed_work_timer_fn().

cancel_delayed_work() == 0 means:

before this patch:
work->func may still be running or queued

after this patch:
work->func may still be running or queued, or
delayed_work_timer_fn->__queue_work() in progress.

The latter doesn't differ from the caller's POV,
delayed_work_timer_fn() is called with _PENDING
bit set.

cancel_delayed_work() == 1 with this patch adds a new possibility:

delayed_work->work was cancelled, but delayed_work_timer_fn
is still running (this is only possible for the re-arming
works on single-threaded workqueue).

In this case the timer was re-started by work->func(), nobody
else can do this. This in turn means that delayed_work_timer_fn
has already passed __queue_work() (and won't touch delayed_work)
because nobody else can queue delayed_work->work.

Signed-off-by: Oleg Nesterov [EMAIL PROTECTED]
Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 include/linux/workqueue.h |7 ---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 2a7b38d..b8abfc7 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -191,14 +191,15 @@ int execute_in_process_context(work_func_t fn, struct execute_work *);
 
 /*
  * Kill off a pending schedule_delayed_work().  Note that the work callback
- * function may still be running on return from cancel_delayed_work().  Run
- * flush_scheduled_work() to wait on it.
+ * function may still be running on return from cancel_delayed_work(), unless
+ * it returns 1 and the work doesn't re-arm itself. Run flush_workqueue() or
+ * cancel_work_sync() to wait on it.
  */
 static inline int cancel_delayed_work(struct delayed_work *work)
 {
 	int ret;
 
-	ret = del_timer_sync(&work->timer);
+	ret = del_timer(&work->timer);
 	if (ret)
 		work_release(&work->work);
 	return ret;



[PATCH 00/14] AF_RXRPC socket family and AFS rewrite [net-2.6]

2007-04-26 Thread David Howells

[This set of patches is built against Dave Miller's net-2.6 GIT tree]

The first four of these patches together provide secure client-side RxRPC
connectivity as a Linux kernel socket family.  Only the RxRPC transport/session
side is supplied - the presentation side (marshalling the data) is left to the
client.  Copies of the patches can be found here:

http://people.redhat.com/~dhowells/rxrpc/nfs-2.6/series

http://people.redhat.com/~dhowells/rxrpc/nfs-2.6/01-cancel_delayed_work.diff
http://people.redhat.com/~dhowells/rxrpc/nfs-2.6/02-keys.diff
http://people.redhat.com/~dhowells/rxrpc/nfs-2.6/03-timer-exports.diff
http://people.redhat.com/~dhowells/rxrpc/nfs-2.6/04-af_rxrpc.diff

Further patches make the in-kernel AFS filesystem use AF_RXRPC and delete the
old RxRPC implementation:

http://people.redhat.com/~dhowells/rxrpc/nfs-2.6/05-afs-cleanup.diff
http://people.redhat.com/~dhowells/rxrpc/nfs-2.6/06-af_rxrpc-kernel.diff
http://people.redhat.com/~dhowells/rxrpc/nfs-2.6/07-af_rxrpc-afs.diff

http://people.redhat.com/~dhowells/rxrpc/nfs-2.6/08-af_rxrpc-delete-old.diff

And then the rest of the patches extend AFS to provide automatic unmounting of
automount trees, security support and directory-level write support (create,
mkdir, etc.):

http://people.redhat.com/~dhowells/rxrpc/nfs-2.6/09-afs-multimount.diff
http://people.redhat.com/~dhowells/rxrpc/nfs-2.6/10-afs-security.diff
http://people.redhat.com/~dhowells/rxrpc/nfs-2.6/11-afs-doc.diff

http://people.redhat.com/~dhowells/rxrpc/nfs-2.6/12-afs-get-capabilities.diff

http://people.redhat.com/~dhowells/rxrpc/nfs-2.6/13-afs-initcallbackstate3.diff

http://people.redhat.com/~dhowells/rxrpc/nfs-2.6/14-afs-dir-write-support.diff

Note that file-level write support is not yet complete and so is not included
in this patch set.


The userspace access methods make use of the control data passed to/by
sendmsg() and recvmsg().  See the three simple test programs:

http://people.redhat.com/~dhowells/rxrpc/klog.c
http://people.redhat.com/~dhowells/rxrpc/rxrpc.c
http://people.redhat.com/~dhowells/rxrpc/listen.c
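
As a rough illustration of what "control data passed to sendmsg()" looks like, here is a minimal sketch that attaches one control message carrying a user call ID to a msghdr.  It only builds the structure and sends nothing.  The SOL_RXRPC value and the EX_RXRPC_USER_CALL_ID name and value are assumptions made for this sketch; a real program should take them from the kernel's rxrpc header, and ex_attach_call_id() is a hypothetical helper, not part of the test programs above.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>

/* Assumed values for this sketch; real code should use the kernel header. */
#ifndef SOL_RXRPC
#define SOL_RXRPC 272
#endif
#define EX_RXRPC_USER_CALL_ID 1	/* hypothetical local name for the cmsg type */

/*
 * Prepare msg so that it carries a single control message holding call_id.
 * ctrl must be suitably aligned and at least
 * CMSG_SPACE(sizeof(unsigned long)) bytes long.
 * Returns the number of control bytes used, or 0 if ctrl is too small.
 */
static size_t ex_attach_call_id(struct msghdr *msg, void *ctrl,
				size_t ctrl_len, unsigned long call_id)
{
	struct cmsghdr *cmsg;

	if (ctrl_len < CMSG_SPACE(sizeof(call_id)))
		return 0;

	memset(msg, 0, sizeof(*msg));
	memset(ctrl, 0, ctrl_len);
	msg->msg_control = ctrl;
	msg->msg_controllen = CMSG_SPACE(sizeof(call_id));

	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = SOL_RXRPC;
	cmsg->cmsg_type = EX_RXRPC_USER_CALL_ID;
	cmsg->cmsg_len = CMSG_LEN(sizeof(call_id));
	memcpy(CMSG_DATA(cmsg), &call_id, sizeof(call_id));

	return msg->msg_controllen;
}
```

The caller would still have to fill in msg_iov (and msg_name if the socket is not connected) before handing the msghdr to sendmsg().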

The klog program is provided to go and get a Kerberos IV key from the AFS
kaserver.  Currently it must be edited before compiling to note the right
server IP address and the appropriate credentials.

These programs can be compiled by:

make klog rxrpc listen CFLAGS="-Wall -g" LDLIBS="-lcrypto -lcrypt -lkrb4 -lkeyutils"

Then a ticket can be obtained by:

./klog

If a security key is acquired in this way, then all subsequent AFS operations -
including VL lookups and mounts - performed with that session keyring will be
authenticated using that key.  The key can be viewed like so:

[EMAIL PROTECTED] ~]# keyctl show
Session Keyring
       -3 --alswrv      0     0  keyring: _ses.3268
        2 --alswrv      0     0   \_ keyring: _uid.0
111416553 --als--v      0     0       \_ rxrpc: [EMAIL PROTECTED]

TODO:

 (*) Make certain parameters (such as connection timeouts) userspace
 configurable.

 (*) Make userspace utilities use it; librxrpc.

 (*) Userspace documentation.

 (*) KerberosV security.

Changes:

 (*) SOCK_RPC has been removed.  SOCK_DGRAM is now used instead.

 (*) I've added a facility whereby calls can be made to destinations other than
 the connect() address of a client socket by making use of msg_name in the
 msghdr struct when using sendmsg() to send the first data packet of a
 call.  Indeed, a client socket need not be connected before being used
 so.

 (*) I've also added a facility whereby client calls may also be made on
 server sockets, again by using msg_name in the msghdr struct.  In such a
 case, the server's local transport endpoint is used.

 (*) I've made the write buffer space check available to various callers
 (sk_write_space) and implemented poll support.

 (*) Rewrote rxrpc_recvmsg().  It now concatenates adjacent data messages from
 the same call when delivering them.

 (*) Updated the documentation to include notes on recvmsg, cover control
 messages and cover SOL_RXRPC-level socket options.

 (*) Provided an in-kernel interface to give in-kernel utilities easier access
 to the facility.

 (*) Made fs/afs/ use it.

 (*) Deleted the old contents of net/rxrpc/.

 (*) Use the scatterlist interface to the crypto API for now.  The patch that
 added the direct access interface conflicts with patches Herbert Xu is
 producing, so I've dropped it for the moment.

 (*) Moved a bug fix to make secure connection reuse work from the
 af_rxrpc-kernel patch to the af_rxrpc main patch.

 (*) Make RxRPC use its own private work queues rather than keventd's to avoid
 deadlocks when AFS tries to use keventd too.  This also puts encryption
 in the private work queue rather than keventd's queue as that might 

[PATCH 02/14] AF_RXRPC: Key facility changes for AF_RXRPC [net-2.6]

2007-04-26 Thread David Howells
Export the keyring key type definition and document its availability.

Add alternative types into the key's type_data union to make it more useful.
Not all users necessarily want to use it as a list_head (AF_RXRPC doesn't, for
example), so make it clear that it can be used in other ways.
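
For illustration only, here is a userspace sketch of how a key type might use the alternative views of the union.  The struct list_head definition is a local stand-in for the kernel's, and ex_type_data and ex_store_ptrs() are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Same shape as the type_data union after this patch. */
union ex_type_data {
	struct list_head link;	/* original use: chain keys into a list */
	unsigned long x[2];	/* alternative: two integers */
	void *p[2];		/* alternative: two pointers */
};

/* A key type that does not need the list can stash two pointers instead. */
static void ex_store_ptrs(union ex_type_data *td, void *first, void *second)
{
	td->p[0] = first;
	td->p[1] = second;
}
```

The new members are no larger than the list_head already in the union, so the union should not grow on the usual Linux targets.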

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 Documentation/keys.txt  |   12 
 include/linux/key.h |2 ++
 security/keys/keyring.c |2 ++
 3 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/Documentation/keys.txt b/Documentation/keys.txt
index 60c665d..81d9aa0 100644
--- a/Documentation/keys.txt
+++ b/Documentation/keys.txt
@@ -859,6 +859,18 @@ payload contents for more information.
void unregister_key_type(struct key_type *type);
 
 
+Under some circumstances, it may be desirable to deal with a
+bundle of keys.  The facility provides access to the keyring type for managing
+such a bundle:
+
+   struct key_type key_type_keyring;
+
+This can be used with a function such as request_key() to find a specific
+keyring in a process's keyrings.  A keyring thus found can then be searched
+with keyring_search().  Note that it is not possible to use request_key() to
+search a specific keyring, so using keyrings in this way is of limited utility.
+
+
 ===
 NOTES ON ACCESSING PAYLOAD CONTENTS
 ===
diff --git a/include/linux/key.h b/include/linux/key.h
index 169f05e..a9220e7 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -160,6 +160,8 @@ struct key {
 	 */
 	union {
 		struct list_head	link;
+		unsigned long		x[2];
+		void			*p[2];
 	} type_data;
 
 	/* key data
diff --git a/security/keys/keyring.c b/security/keys/keyring.c
index ad45ce7..88292e3 100644
--- a/security/keys/keyring.c
+++ b/security/keys/keyring.c
@@ -66,6 +66,8 @@ struct key_type key_type_keyring = {
.read   = keyring_read,
 };
 
+EXPORT_SYMBOL(key_type_keyring);
+
 /*
  * semaphore to serialise link/link calls to prevent two link calls in parallel
  * introducing a cycle



[PATCH 03/14] AF_RXRPC: Make it possible to merely try to cancel timers from a module [net-2.6]

2007-04-26 Thread David Howells
Export try_to_del_timer_sync() for use by the AF_RXRPC module.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 kernel/timer.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/kernel/timer.c b/kernel/timer.c
index dd6c2c1..b22bd39 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -505,6 +505,8 @@ out:
return ret;
 }
 
+EXPORT_SYMBOL(try_to_del_timer_sync);
+
 /**
  * del_timer_sync - deactivate a timer and wait for the handler to finish.
  * @timer: the timer to be deactivated



[PATCH 09/14] AFS: Handle multiple mounts of an AFS superblock correctly [net-2.6]

2007-04-26 Thread David Howells
Handle multiple mounts of an AFS superblock correctly, checking to see whether
the superblock is already initialised after calling sget() rather than just
unconditionally stamping all over it.

Also delete the silent parameter to afs_fill_super() as it's not used and
can, in any case, be obtained from sb->s_flags.
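
The fill-once logic can be sketched in userspace as follows.  This is a simplified model under stated assumptions: ex_super, ex_fill_super(), ex_get_sb() and the EX_MS_ACTIVE value are hypothetical stand-ins, and locking and error unwinding are left out.

```c
#include <assert.h>
#include <stddef.h>

#define EX_MS_ACTIVE 0x1u	/* hypothetical flag value for this sketch */

struct ex_super {
	void *s_root;		/* NULL until the superblock has been filled in */
	unsigned int s_flags;
	int fill_count;		/* how many times ex_fill_super() has run */
};

static int ex_root_token;	/* stands in for the root dentry */

static int ex_fill_super(struct ex_super *sb)
{
	sb->s_root = &ex_root_token;
	sb->fill_count++;
	return 0;
}

/* Fill in the superblock on first use only; later mounts reuse it as is. */
static int ex_get_sb(struct ex_super *sb, unsigned int flags)
{
	if (!sb->s_root) {
		int ret;

		/* initial superblock/root creation */
		sb->s_flags = flags;
		ret = ex_fill_super(sb);
		if (ret < 0)
			return ret;
		sb->s_flags |= EX_MS_ACTIVE;
	} else {
		/* reuse: it must already have been marked active */
		assert(sb->s_flags & EX_MS_ACTIVE);
	}
	return 0;
}
```

A second call with different flags leaves both the root and the flags of the existing superblock untouched, which is the point of the fix.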

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/afs/super.c |   26 --
 1 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/fs/afs/super.c b/fs/afs/super.c
index efc4fe6..77e6875 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -212,7 +212,7 @@ static int afs_test_super(struct super_block *sb, void *data)
 /*
  * fill in the superblock
  */
-static int afs_fill_super(struct super_block *sb, void *data, int silent)
+static int afs_fill_super(struct super_block *sb, void *data)
 {
 	struct afs_mount_params *params = data;
 	struct afs_super_info *as = NULL;
@@ -319,17 +319,23 @@ static int afs_get_sb(struct file_system_type *fs_type,
 		goto error;
 	}
 
-	sb->s_flags = flags;
-
-	ret = afs_fill_super(sb, &params, flags & MS_SILENT ? 1 : 0);
-	if (ret < 0) {
-		up_write(&sb->s_umount);
-		deactivate_super(sb);
-		goto error;
+	if (!sb->s_root) {
+		/* initial superblock/root creation */
+		_debug("create");
+		sb->s_flags = flags;
+		ret = afs_fill_super(sb, &params);
+		if (ret < 0) {
+			up_write(&sb->s_umount);
+			deactivate_super(sb);
+			goto error;
+		}
+		sb->s_flags |= MS_ACTIVE;
+	} else {
+		_debug("reuse");
+		ASSERTCMP(sb->s_flags, &, MS_ACTIVE);
 	}
-	sb->s_flags |= MS_ACTIVE;
-	simple_set_mnt(mnt, sb);
 
+	simple_set_mnt(mnt, sb);
 	afs_put_volume(params.volume);
 	afs_put_cell(params.default_cell);
 	_leave(" = 0 [%p]", sb);



[PATCH 11/14] AFS: Update the AFS fs documentation [net-2.6]

2007-04-26 Thread David Howells
Update the AFS fs documentation.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 Documentation/filesystems/afs.txt |  214 +++--
 1 files changed, 154 insertions(+), 60 deletions(-)

diff --git a/Documentation/filesystems/afs.txt b/Documentation/filesystems/afs.txt
index 2f4237d..12ad6c7 100644
--- a/Documentation/filesystems/afs.txt
+++ b/Documentation/filesystems/afs.txt
@@ -1,31 +1,82 @@
+
 kAFS: AFS FILESYSTEM
 
 
-ABOUT
-=
+Contents:
+
+ - Overview.
+ - Usage.
+ - Mountpoints.
+ - Proc filesystem.
+ - The cell database.
+ - Security.
+ - Examples.
+
+
+
+OVERVIEW
+
 
-This filesystem provides a fairly simple AFS filesystem driver. It is under
-development and only provides very basic facilities. It does not yet support
-the following AFS features:
+This filesystem provides a fairly simple secure AFS filesystem driver. It is
+under development and does not yet provide the full feature set.  The features
+it does support include:
 
-   (*) Write support.
-   (*) Communications security.
-   (*) Local caching.
-   (*) pioctl() system call.
-   (*) Automatic mounting of embedded mountpoints.
+ (*) Security (currently only AFS kaserver and KerberosIV tickets).
 
+ (*) File reading.
 
+ (*) Automounting.
+
+It does not yet support the following AFS features:
+
+ (*) Write support.
+
+ (*) Local caching.
+
+ (*) pioctl() system call.
+
+
+===
+COMPILATION
+===
+
+The filesystem should be enabled by turning on the kernel configuration
+options:
+
+   CONFIG_AF_RXRPC - The RxRPC protocol transport
+   CONFIG_RXKAD- The RxRPC Kerberos security handler
+   CONFIG_AFS  - The AFS filesystem
+
+Additionally, the following can be turned on to aid debugging:
+
+   CONFIG_AF_RXRPC_DEBUG   - Permit AF_RXRPC debugging to be enabled
+   CONFIG_AFS_DEBUG- Permit AFS debugging to be enabled
+
+They permit the debugging messages to be turned on dynamically by manipulating
+the masks in the following files:
+
+   /sys/module/af_rxrpc/parameters/debug
+   /sys/module/afs/parameters/debug
+
+
+=
 USAGE
 =
 
 When inserting the driver modules the root cell must be specified along with a
 list of volume location server IP addresses:
 
-   insmod rxrpc.o
+   insmod af_rxrpc.o
+   insmod rxkad.o
insmod kafs.o rootcell=cambridge.redhat.com:172.16.18.73:172.16.18.91
 
-The first module is a driver for the RxRPC remote operation protocol, and the
-second is the actual filesystem driver for the AFS filesystem.
+The first module is the AF_RXRPC network protocol driver.  This provides the
+RxRPC remote operation protocol and may also be accessed from userspace.  See:
+
+   Documentation/networking/rxrpc.txt
+
+The second module is the kerberos RxRPC security driver, and the third module
+is the actual filesystem driver for the AFS filesystem.
 
 Once the module has been loaded, more modules can be added by the following
 procedure:
@@ -33,7 +84,7 @@ procedure:
 	echo add grand.central.org 18.7.14.88:128.2.191.224 >/proc/fs/afs/cells
 
 Where the parameters to the add command are the name of a cell and a list of
-volume location servers within that cell.
+volume location servers within that cell, with the latter separated by colons.
 
 Filesystems can be mounted anywhere by commands similar to the following:
 
@@ -42,11 +93,6 @@ Filesystems can be mounted anywhere by commands similar to the following:
 	mount -t afs "#root.afs." /afs
 	mount -t afs "#root.cell." /afs/cambridge
 
-  NB: When using this on Linux 2.4, the mount command has to be different,
-  since the filesystem doesn't have access to the device name argument:
-
-	mount -t afs none /afs -ovol="#root.afs."
-
 Where the initial character is either a hash or a percent symbol depending on
 whether you definitely want a R/W volume (hash) or whether you'd prefer a R/O
 volume, but are willing to use a R/W volume instead (percent).
@@ -60,55 +106,66 @@ named volume will be looked up in the cell specified during insmod.
 Additional cells can be added through /proc (see later section).
 
 
+===
 MOUNTPOINTS
 ===
 
-AFS has a concept of mountpoints. These are specially formatted symbolic links
-(of the same form as the device name passed to mount). kAFS presents these
-to the user as directories that have special properties:
+AFS has a concept of mountpoints. In AFS terms, these are specially formatted
+symbolic links (of the same form as the device name passed to mount).  kAFS
+presents these to the user as directories that have a follow-link capability
+(ie: symbolic link semantics).  If anyone attempts to access them, they will
+automatically cause the target volume to be mounted (if possible) on that site.
 
-  (*) They cannot

[PATCH 12/14] AFS: Add support for the CB.GetCapabilities operation [net-2.6]

2007-04-26 Thread David Howells
Add support for the CB.GetCapabilities operation with which the fileserver can
ask the client for the following information:

 (1) The list of network interfaces it has available as IPv4 address + netmask
 plus the MTUs.

 (2) The client's UUID.

 (3) The extended capabilities of the client, for which the only current one
 is unified error mapping (abort code interpretation).

To support this, the patch adds the following routines to AFS:

 (1) A function to iterate through all the network interfaces using RTNETLINK
 to extract IPv4 addresses and MTUs.

 (2) A function to iterate through all the network interfaces using RTNETLINK
 to pull out the MAC address of the lowest index interface to use in UUID
 construction.
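
As a hedged sketch of the reply layout, the following userspace code mirrors the reply structure declared in SRXAFSCB_GetCapabilities() in this patch, using uint32_t in place of __be32.  The ex_* names are hypothetical, and setting capcount to 1 is an assumption based on the single caps[1] slot, not something taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

#define EX_MAX_IFS		32	/* array bound used by the patch */
#define EX_CAP_ERROR_TRANSLATION 0x1	/* same value as AFS_CAP_ERROR_TRANSLATION */

/* Wire layout of the reply: every field is a big-endian 32-bit word. */
struct ex_caps_reply {
	struct /* InterfaceAddr */ {
		uint32_t nifs;
		uint32_t uuid[11];
		uint32_t ifaddr[EX_MAX_IFS];
		uint32_t netmask[EX_MAX_IFS];
		uint32_t mtu[EX_MAX_IFS];
	} ia;
	struct /* Capabilities */ {
		uint32_t capcount;
		uint32_t caps[1];
	} cap;
};

/* Zero the reply, then fill in the interface count and capability word. */
static void ex_init_reply(struct ex_caps_reply *reply, uint32_t nifs)
{
	memset(reply, 0, sizeof(*reply));
	reply->ia.nifs = htonl(nifs);
	reply->cap.capcount = htonl(1);	/* assumption: one capability word */
	reply->cap.caps[0] = htonl(EX_CAP_ERROR_TRANSLATION);
}
```

With these array sizes the reply is 110 words, i.e. 440 bytes, however many interfaces are actually reported.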

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/afs/Makefile|1 
 fs/afs/afs_cm.h|3 
 fs/afs/cmservice.c |   98 ++
 fs/afs/internal.h  |   42 
 fs/afs/main.c  |   49 +
 fs/afs/rxrpc.c |   39 
 fs/afs/use-rtnetlink.c |  473 
 7 files changed, 705 insertions(+), 0 deletions(-)

diff --git a/fs/afs/Makefile b/fs/afs/Makefile
index cca198b..01545eb 100644
--- a/fs/afs/Makefile
+++ b/fs/afs/Makefile
@@ -18,6 +18,7 @@ kafs-objs := \
security.o \
server.o \
super.o \
+   use-rtnetlink.o \
vlclient.o \
vlocation.o \
vnode.o \
diff --git a/fs/afs/afs_cm.h b/fs/afs/afs_cm.h
index 7c8e3d4..d4bd201 100644
--- a/fs/afs/afs_cm.h
+++ b/fs/afs/afs_cm.h
@@ -23,6 +23,9 @@ enum AFS_CM_Operations {
 	CBGetCE			= 208,	/* get cache file description */
 	CBGetXStatsVersion	= 209,	/* get version of extended statistics */
 	CBGetXStats		= 210,	/* get contents of extended statistics data */
+	CBGetCapabilities	= 65538, /* get client capabilities */
 };
 
+#define AFS_CAP_ERROR_TRANSLATION  0x1
+
 #endif /* AFS_FS_H */
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index c3ec57a..a6af3ac 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -22,6 +22,8 @@ static int afs_deliver_cb_init_call_back_state(struct afs_call *,
   struct sk_buff *, bool);
 static int afs_deliver_cb_probe(struct afs_call *, struct sk_buff *, bool);
 static int afs_deliver_cb_callback(struct afs_call *, struct sk_buff *, bool);
+static int afs_deliver_cb_get_capabilities(struct afs_call *, struct sk_buff *,
+  bool);
 static void afs_cm_destructor(struct afs_call *);
 
 /*
@@ -55,6 +57,16 @@ static const struct afs_call_type afs_SRXCBProbe = {
 };
 
 /*
+ * CB.GetCapabilities operation type
+ */
+static const struct afs_call_type afs_SRXCBGetCapabilites = {
+	.name		= "CB.GetCapabilities",
+	.deliver	= afs_deliver_cb_get_capabilities,
+	.abort_to_error	= afs_abort_to_error,
+	.destructor	= afs_cm_destructor,
+};
+
+/*
  * route an incoming cache manager call
  * - return T if supported, F if not
  */
@@ -74,6 +86,9 @@ bool afs_cm_incoming_call(struct afs_call *call)
 	case CBProbe:
 		call->type = &afs_SRXCBProbe;
 		return true;
+	case CBGetCapabilities:
+		call->type = &afs_SRXCBGetCapabilites;
+		return true;
 	default:
 		return false;
 	}
@@ -328,3 +343,86 @@ static int afs_deliver_cb_probe(struct afs_call *call, struct sk_buff *skb,
 	schedule_work(&call->work);
 	return 0;
 }
+
+/*
+ * allow the fileserver to ask about the cache manager's capabilities
+ */
+static void SRXAFSCB_GetCapabilities(struct work_struct *work)
+{
+	struct afs_interface *ifs;
+	struct afs_call *call = container_of(work, struct afs_call, work);
+	int loop, nifs;
+
+	struct {
+		struct /* InterfaceAddr */ {
+			__be32 nifs;
+			__be32 uuid[11];
+			__be32 ifaddr[32];
+			__be32 netmask[32];
+			__be32 mtu[32];
+		} ia;
+		struct /* Capabilities */ {
+			__be32 capcount;
+			__be32 caps[1];
+		} cap;
+	} reply;
+
+	_enter("");
+
+	nifs = 0;
+	ifs = kcalloc(32, sizeof(*ifs), GFP_KERNEL);
+	if (ifs) {
+		nifs = afs_get_ipv4_interfaces(ifs, 32, false);
+		if (nifs < 0) {
+			kfree(ifs);
+			ifs = NULL;
+			nifs = 0;
+		}
+	}
+
+	memset(&reply, 0, sizeof(reply));
+	reply.ia.nifs = htonl(nifs);
+
+	reply.ia.uuid[0] = htonl(afs_uuid.time_low);
+	reply.ia.uuid[1] = htonl(afs_uuid.time_mid);
+	reply.ia.uuid[2] = htonl(afs_uuid.time_hi_and_version);
+	reply.ia.uuid[3] = htonl((s8

[PATCH 13/14] AFS: Implement the CB.InitCallBackState3 operation [net-2.6]

2007-04-26 Thread David Howells
Implement the CB.InitCallBackState3 operation for the fileserver to call.
This reduces the amount of network traffic because if this op is aborted, the
fileserver will then attempt a CB.InitCallBackState operation.
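
To make the routing concrete, here is a small userspace sketch of the opcode dispatch.  Only the two operation numbers that appear in these patches are used; the ex_* names are hypothetical, and returning NULL merely models "unsupported, so the call gets aborted".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Operation numbers as they appear in fs/afs/afs_cm.h in these patches. */
enum ex_cm_op {
	EX_CBInitCallBackState3	= 213,
	EX_CBGetCapabilities	= 65538,
};

/*
 * Map an incoming cache manager operation ID to the name of its handler,
 * or return NULL if the operation is not supported.
 */
static const char *ex_route_call(unsigned int op)
{
	switch (op) {
	case EX_CBInitCallBackState3:
		return "CB.InitCallBackState3";
	case EX_CBGetCapabilities:
		return "CB.GetCapabilities";
	default:
		return NULL;	/* unsupported: the call would be aborted */
	}
}
```

Before this patch, operation 213 would have fallen into the default case, which is what causes the fileserver to retry with the older operation.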

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/afs/afs_cm.h|1 +
 fs/afs/cmservice.c |   46 ++
 2 files changed, 47 insertions(+), 0 deletions(-)

diff --git a/fs/afs/afs_cm.h b/fs/afs/afs_cm.h
index d4bd201..7b4d4fa 100644
--- a/fs/afs/afs_cm.h
+++ b/fs/afs/afs_cm.h
@@ -23,6 +23,7 @@ enum AFS_CM_Operations {
 	CBGetCE			= 208,	/* get cache file description */
 	CBGetXStatsVersion	= 209,	/* get version of extended statistics */
 	CBGetXStats		= 210,	/* get contents of extended statistics data */
+	CBInitCallBackState3	= 213,	/* initialise callback state, version 3 */
 	CBGetCapabilities	= 65538, /* get client capabilities */
 };
 
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index a6af3ac..6685f4c 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -20,6 +20,8 @@ struct workqueue_struct *afs_cm_workqueue;
 
 static int afs_deliver_cb_init_call_back_state(struct afs_call *,
   struct sk_buff *, bool);
+static int afs_deliver_cb_init_call_back_state3(struct afs_call *,
+   struct sk_buff *, bool);
 static int afs_deliver_cb_probe(struct afs_call *, struct sk_buff *, bool);
 static int afs_deliver_cb_callback(struct afs_call *, struct sk_buff *, bool);
 static int afs_deliver_cb_get_capabilities(struct afs_call *, struct sk_buff *,
@@ -47,6 +49,16 @@ static const struct afs_call_type afs_SRXCBInitCallBackState = {
 };
 
 /*
+ * CB.InitCallBackState3 operation type
+ */
+static const struct afs_call_type afs_SRXCBInitCallBackState3 = {
+	.name		= "CB.InitCallBackState3",
+	.deliver	= afs_deliver_cb_init_call_back_state3,
+	.abort_to_error	= afs_abort_to_error,
+	.destructor	= afs_cm_destructor,
+};
+
+/*
  * CB.Probe operation type
  */
 static const struct afs_call_type afs_SRXCBProbe = {
@@ -83,6 +95,9 @@ bool afs_cm_incoming_call(struct afs_call *call)
 	case CBInitCallBackState:
 		call->type = &afs_SRXCBInitCallBackState;
 		return true;
+	case CBInitCallBackState3:
+		call->type = &afs_SRXCBInitCallBackState3;
+		return true;
 	case CBProbe:
 		call->type = &afs_SRXCBProbe;
 		return true;
@@ -312,6 +327,37 @@ static int afs_deliver_cb_init_call_back_state(struct afs_call *call,
 }
 
 /*
+ * deliver request data to a CB.InitCallBackState3 call
+ */
+static int afs_deliver_cb_init_call_back_state3(struct afs_call *call,
+						struct sk_buff *skb,
+						bool last)
+{
+	struct afs_server *server;
+	struct in_addr addr;
+
+	_enter(",{%u},%d", skb->len, last);
+
+	if (!last)
+		return 0;
+
+	/* no unmarshalling required */
+	call->state = AFS_CALL_REPLYING;
+
+	/* we'll need the file server record as that tells us which set of
+	 * vnodes to operate upon */
+	memcpy(&addr, &ip_hdr(skb)->saddr, 4);
+	server = afs_find_server(&addr);
+	if (!server)
+		return -ENOTCONN;
+	call->server = server;
+
+	INIT_WORK(&call->work, SRXAFSCB_InitCallBackState);
+	schedule_work(&call->work);
+	return 0;
+}
+
+/*
  * allow the fileserver to see if the cache manager is still alive
  */
 static void SRXAFSCB_Probe(struct work_struct *work)


