date:20071210

On Wed, Dec 05, 2007 at 11:28:34PM -0500, Hideo AOKI wrote:

> 1. Using sk_forward_alloc and adding socket lock
>
> UDP already uses a socket lock to send messages. However, it doesn't
> use the lock to receive messages. I wonder if we can also use the
> lock when sk_forward_alloc is updated in receive processing.
> I understand a performance issue might occur, but ...

Having discussed this with Dave we've agreed that this is the
best way to go.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

routing policy based on u32 classifier

2007-12-10 Thread Marco Berizzi

Hello everybody.
Kindly, I would like to know if there is any plan
to add this feature to a future kernel release.
I know that fwmark is able to do this, but it has
a limitation in source IP address selection.

TIA



Re: [PATCH 2/8] [TFRC]: Put RX/TX initialisation into tfrc.c

|  This separates RX/TX initialisation and puts all packet history / loss intervals
|  initialisation into tfrc.c.
|  The organisation is uniform: slab declaration - {rx,tx}_init() - {rx,tx}_exit()
| 
| NAK, you can't call a __exit marked routine from a __init marked
| routine.
| 
Ok thanks, will fix that in revision 2.

Re: [PATCH 5/8] [TFRC]: Loss interval code needs the macros/inlines that were moved

|  
|distcc[24516] ERROR: compile /root/.ccache/packet_his.tmp.aspire.home.net.24512.i on _tiptop failed
|/usr/src/davem-2.6/net/dccp/ccids/lib/packet_history.c: In function '__one_after_loss':
|/usr/src/davem-2.6/net/dccp/ccids/lib/packet_history.c:266: error: lvalue required as unary '&' operand
<snip>
| 
| Because you do it this way:
| 
| tfrc_rx_hist_swap(TFRC_RX_HIST_ENTRY(h, 0), TFRC_RX_HIST_ENTRY(h, 3));
| 
| I checked and at least in this patch series all uses are of this type,
| so why not do it using just the indexes, which would be simpler:
| 
| tfrc_rx_hist_swap(h, 0, 3);
| 
| With this implementation:
| 
| static void tfrc_rx_hist_swap(struct tfrc_rx_hist *h, const int a, const int b)
| {
| 	const int idx_a = tfrc_rx_hist_index(h, a),
| 		  idx_b = tfrc_rx_hist_index(h, b);
| 	struct tfrc_rx_hist_entry *tmp = h->ring[idx_a];
| 
| 	h->ring[idx_a] = h->ring[idx_b];
| 	h->ring[idx_b] = tmp;
| }
| 
Agreed, that is useful in the present case, since then everything uses
inlines. The only suggestion I'd like to make is to use `u8' instead of 
`int' since the indices will have very low values.
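
For illustration, the index-based swap with `u8'-sized indices could look
roughly like the userspace sketch below. The names hist_entry, rx_hist,
HIST_LEN and loss_start are assumptions made up for this sketch; they are
not the definitions from net/dccp/ccids/lib. Because the helper receives
only the history and two small positions, no address of a macro or inline
result is ever taken, which is presumably what triggered the
"lvalue required" error quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the TFRC RX history (assumed for this sketch). */
#define HIST_LEN 4		/* assumed ring size, must be a power of two */

struct hist_entry {
	uint64_t seqno;
};

struct rx_hist {
	struct hist_entry *ring[HIST_LEN];
	uint8_t loss_start;	/* assumed base offset into the ring */
};

/* Map a logical position n to a slot in the ring. */
static inline uint8_t rx_hist_index(const struct rx_hist *h, uint8_t n)
{
	return (uint8_t)((h->loss_start + n) & (HIST_LEN - 1));
}

/* Swap two entries given only their logical positions. */
static inline void rx_hist_swap(struct rx_hist *h, uint8_t a, uint8_t b)
{
	const uint8_t idx_a = rx_hist_index(h, a),
		      idx_b = rx_hist_index(h, b);
	struct hist_entry *tmp = h->ring[idx_a];

	assert(a < HIST_LEN && b < HIST_LEN);
	h->ring[idx_a] = h->ring[idx_b];
	h->ring[idx_b] = tmp;
}
```

A caller would then write rx_hist_swap(h, 0, 3), mirroring the
tfrc_rx_hist_swap(h, 0, 3) form proposed above.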

There is a related point: you will probably have noticed that loss_interval.c 
also uses macros. I don't know if you are planning to convert these into 
inlines as well. I think that there would be less benefit in converting these,
since they are local to loss_interval.c and mostly serve to improve readability.

As I have at least one other patch to revise (plus another minor one),
I'll rework this according to the above. 

[PATCH 0/3] UCC TDM driver for MPC83xx platforms

2007-12-10 Thread Aggrwal Poonam


There are three patches:

[PATCH 1/3] drivers/misc : UCC TDM driver for MPC83xx platforms. This
driver is usable in VoIP kind of applications to interface with SLIC
kind of devices to exchange TDM voice samples.

[PATCH 2/3] arch/ : Platform changes
- device tree entries for UCC TDM driver for MPC8323ERDB platform.
- QE changes related to TDM, namely:
  1) Modified ucc_fast_init so that it can be used by the fast UCC
     based TDM driver. The changes mainly configure TDM clocks and
     Fsyncs.
  2) Modified get_brg_clk so that it can return the input frequency
     and input source of any BRG by reading the corresponding entries
     from the device tree.
  3) Added new nodes brg and clocks in the device tree which
     represent input clocks for different BRGs.
  4) Modified qe_setbrg accordingly.
- new device tree entries added for clocks and brg

[PATCH 3/3] Documentation
- Modified Documentation to explain the device tree entries related to
  the UCC TDM driver and the new nodes added (clocks and brg).

The patches apply over a merge of galak's for-2.6.25, for-2.6.24
and of_doc_update branches.
In brief, the steps were:

git clone git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc.git powerpc-galak
git checkout -b for-2.6.25 origin/for-2.6.25
git checkout -b for-2.6.24 origin/for-2.6.24
git checkout -b of_doc_update origin/of_doc_update
git pull . for-2.6.24   # merge the other two
git pull . for-2.6.25
git checkout -b tdm     # clean slate for tdm rebase work

Also, after applying the patches, changes have to be made corresponding to
Tabi's patch "qe: add function qe_clock_source".

The driver has been tested with a VoIP stack and application on
MPC8323ERDB.

With Regards
Poonam 
 
 

[PATCH 1/3] UCC based TDM driver for MPC83xx platforms.

From: Poonam Agarwal-b10812 [EMAIL PROTECTED]

The UCC TDM driver multiplexes and demultiplexes data from different
channels. It can interface with, for example, SLIC kind of devices to
receive TDM data, demultiplex it and send it to upper applications. At
the transmit end it receives data for different channels, multiplexes
it and sends it on the TDM channel. Internally it uses the TSA (Time
Slot Assigner), which does the multiplexing and demultiplexing, the UCC
to perform SDMA between host buffers and the TSA, and the CMX to
connect the TSA to the UCC.
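
As a rough illustration of what multiplexing and demultiplexing mean for
the sample layout, here is a small userspace sketch. It is not taken from
the driver: the function names, the 8-bit sample type and the buffer
layout (samples interleaved channel by channel on the line, contiguous
per channel for the application) are assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Demultiplex a TDM stream. The input holds samples interleaved by channel
 * (ch0, ch1, ..., chN-1, ch0, ...). Sample i of channel c is copied to
 * out[c * samples_per_ch + i], so each channel ends up contiguous.
 */
static void tdm_demux(const uint8_t *in, uint8_t *out,
		      size_t num_ch, size_t samples_per_ch)
{
	assert(in && out && num_ch > 0);
	for (size_t i = 0; i < samples_per_ch; i++)
		for (size_t c = 0; c < num_ch; c++)
			out[c * samples_per_ch + i] = in[i * num_ch + c];
}

/* Inverse operation for the transmit direction. */
static void tdm_mux(const uint8_t *in, uint8_t *out,
		    size_t num_ch, size_t samples_per_ch)
{
	assert(in && out && num_ch > 0);
	for (size_t i = 0; i < samples_per_ch; i++)
		for (size_t c = 0; c < num_ch; c++)
			out[i * num_ch + c] = in[c * samples_per_ch + i];
}
```

Running tdm_demux and then tdm_mux with the same channel count returns
the original interleaved stream.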

This driver will run on MPC8323E-RDB platforms.

Signed-off-by: Poonam Aggrwal [EMAIL PROTECTED]
Signed-off-by: Ashish Kalra [EMAIL PROTECTED]
Signed-off-by: Kim Phillips [EMAIL PROTECTED]
Signed-off-by: Michael Barkowski [EMAIL PROTECTED]
---

 drivers/misc/Kconfig   |   21 +
 drivers/misc/Makefile  |1 +
 drivers/misc/ucc_tdm.c | 1068 
 drivers/misc/ucc_tdm.h |  227 ++
 4 files changed, 1317 insertions(+), 0 deletions(-)
 create mode 100644 drivers/misc/ucc_tdm.c
 create mode 100644 drivers/misc/ucc_tdm.h

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index b5e67c0..698a72c 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -219,6 +219,27 @@ config THINKPAD_ACPI_BAY
 
  If you are not sure, say Y here.
 
+config UCC_TDM
+	bool "Freescale UCC  TDM Driver"
+	depends on QUICC_ENGINE && UCC_FAST
+	default n
+	---help---
+	  The TDM driver is for UCC based TDM devices, for example the TDM
+	  device on MPC832x RDB. Select it to run PowerVoIP on MPC832x RDB board.
+	  The TDM driver can interface with SLIC kind of devices to transmit
+	  and receive TDM samples. The TDM driver receives Time Division
+	  multiplexed samples (for different channels) from the SLIC device,
+	  demultiplexes them and sends them to the upper layers. At the transmit
+	  end the TDM driver receives samples for different channels, it
+	  multiplexes them and sends them to the SLIC device.
+
+config TDM_LINEAR_PCM
+	bool "Linear PCM mode"
+	depends on UCC_TDM
+	---help---
+	  This mode should be selected if the TDM driver interface with the
+	  SLIC device is linear PCM (e.g. 16 bit samples). If not selected the
+	  interface will be 8 bit u-law.
 
 config ATMEL_SSC
 	tristate "Device driver for Atmel SSC peripheral"
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 87f2685..6f0c49d 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -17,3 +17,4 @@ obj-$(CONFIG_SONY_LAPTOP) += sony-laptop.o
 obj-$(CONFIG_THINKPAD_ACPI)+= thinkpad_acpi.o
 obj-$(CONFIG_FUJITSU_LAPTOP)   += fujitsu-laptop.o
 obj-$(CONFIG_EEPROM_93CX6) += eeprom_93cx6.o
+obj-$(CONFIG_UCC_TDM)  += ucc_tdm.o
diff --git a/drivers/misc/ucc_tdm.c b/drivers/misc/ucc_tdm.c
new file mode 100644
index 000..232d537
--- /dev/null
+++ b/drivers/misc/ucc_tdm.c
@@ -0,0 +1,1068 @@
+/*
+ * drivers/misc/ucc_tdm.c
+ *
+ * UCC Based Linux TDM Driver
+ * This driver is designed to support UCC based TDM for PowerPC processors.
+ * This driver can interface with SLIC device to run VOIP kind of
+ * applications.
+ *
+ * Author: Ashish Kalra & Poonam Aggrwal
+ *
+ * Copyright (c) 2007 Freescale Semiconductor, Inc.
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#include <linux/autoconf.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/errno.h>
+#include <linux/types.h>
+#include <linux/interrupt.h>
+#include <linux/time.h>
+#include <linux/skbuff.h>
+#include <linux/proc_fs.h>
+#include <linux/delay.h>
+#include <linux/dma-mapping.h>
+#include <linux/string.h>
+#include <linux/irq.h>
+#include <linux/of_platform.h>
+#include <linux/io.h>
+#include <linux/wait.h>
+#include <linux/timer.h>
+
+#include <asm/immap_qe.h>
+#include <asm/qe.h>
+#include <asm/ucc.h>
+#include <asm/ucc_fast.h>
+#include <asm/ucc_slow.h>
+
+#include "ucc_tdm.h"
+#define DRV_DESC "Freescale QE UCC TDM Driver"
+#define DRV_NAME "ucc_tdm"
+
+/*
+ * define the following #define if snooping or hardware-based cache coherency
+ * is disabled on the UCC transparent controller.This flag enables
+ * software-based cache-coherency support by explicitly flushing data cache
+ * contents after setting up the TDM output buffer(s) and invalidating the
+ * data cache contents before the TDM input buffer(s) are read.
+ */
+#undef UCC_CACHE_SNOOPING_DISABLED
+
+#define MAX_NUM_TDM_DEVICES 8
+
+static struct tdm_ctrl *tdm_ctrl[MAX_NUM_TDM_DEVICES];
+
+static int num_tdm_devices;
+static int num_tdm_clients;
+
+#define PREV_PHASE(x) ((x == 0) ? MAX_PHASE : (x - 1))
+#define NEXT_PHASE(x) (((x + 1) > MAX_PHASE) ? 0 : (x +

[0/4] DST: Distributed storage.


Distributed storage.

I'm pleased to announce the 11th release of the distributed
storage subsystem (DST). This is a maintenance release and includes
bug fixes and simple feature extensions only.

DST allows forming a storage on top of local and remote nodes
and combining them into a linear or mirroring setup, which in
turn can be exported to remote nodes.
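
To make the linear setup concrete: each added node starts where the
storage currently ends, and a logical sector is remapped to the node that
covers it. The userspace sketch below only illustrates that idea; the
struct and function names and the fixed MAX_NODES limit are assumptions
for the example and are not the DST kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 8	/* arbitrary limit chosen for this sketch */

struct node {
	uint64_t start;	/* first logical sector covered by this node */
	uint64_t size;	/* number of sectors on this node */
};

struct storage {
	struct node nodes[MAX_NODES];
	size_t num_nodes;
	uint64_t disk_size;	/* total size in sectors */
};

/* Append a node at the current end of the storage. Returns 0 or -1. */
static int linear_add_node(struct storage *st, uint64_t size)
{
	if (st->num_nodes == MAX_NODES)
		return -1;
	st->nodes[st->num_nodes].start = st->disk_size;
	st->nodes[st->num_nodes].size = size;
	st->num_nodes++;
	st->disk_size += size;
	return 0;
}

/*
 * Find the node covering a logical sector and store the node-local sector
 * in *local. Returns the node index, or -1 if the sector is out of range.
 */
static int linear_remap(const struct storage *st, uint64_t sector,
			uint64_t *local)
{
	assert(st != NULL && local != NULL);
	for (size_t i = 0; i < st->num_nodes; i++) {
		const struct node *n = &st->nodes[i];

		if (sector >= n->start && sector - n->start < n->size) {
			*local = sector - n->start;
			return (int)i;
		}
	}
	return -1;
}
```

With two nodes of 800 sectors each, logical sector 900 lands on the
second node at local sector 100.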

Short changelog:
 * wake up state when mirror detects an error to speed up reconnect
 * if connecting in csum mode to no-csum server, do not enable csums
 * do not clean queue until all users are removed
 * allow to increase size of the storage in linear add callback 
(with this change it is possible to add nodes into linear array
in real time without stopping storage. Filesystem has to be prepared
for the case when underlying device has changed its size.
Real-time addon of mirror nodes is also supported)
 * allow to delete gendisk only after device was started
 * dst debug config option
 * Name: Gamardjoba, genacvale! ('Hi friend' in georgian)

Great thanks to Matthew Hodgson [EMAIL PROTECTED] for debugging!

Overall list of features of the DST can be found on project's homepage:

http://tservice.net.ru/~s0mbre/old/?section=projects&item=dst

Thank you.

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]



[1/4] DST: Distributed storage documentation.


Distributed storage documentation.

Algorithms used in the system, userspace interfaces
(sysfs dirs and files), design and implementation details
are described here.
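
One of the documented ideas is that mirror reads go to the "nearest"
node, meaning the node whose previous operation stopped closest to the
start of the new request. A minimal userspace sketch of such a selection
follows. It only illustrates the described policy under assumed names
(mirror_node, prev_pos, mirror_pick_reader); the actual selection code in
alg_mirror.c is not shown in this posting and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mirror_node {
	uint64_t prev_pos;	/* sector where the previous operation stopped */
};

static uint64_t seek_distance(uint64_t a, uint64_t b)
{
	return a > b ? a - b : b - a;
}

/* Return the index of the node whose last position is closest to start. */
static size_t mirror_pick_reader(const struct mirror_node *nodes, size_t num,
				 uint64_t start)
{
	size_t best = 0;

	assert(nodes != NULL && num > 0);
	for (size_t i = 1; i < num; i++)
		if (seek_distance(nodes[i].prev_pos, start) <
		    seek_distance(nodes[best].prev_pos, start))
			best = i;
	return best;
}

/* Record where a read of `size` sectors starting at `start` left the node. */
static void mirror_complete_read(struct mirror_node *nodes, size_t idx,
				 uint64_t start, uint64_t size)
{
	nodes[idx].prev_pos = start + size;
}
```

Writes are not modelled here, since they always go to all underlying
nodes.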

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]


diff --git a/Documentation/dst/algorithms.txt b/Documentation/dst/algorithms.txt
new file mode 100644
index 000..1437a6a
--- /dev/null
+++ b/Documentation/dst/algorithms.txt
@@ -0,0 +1,115 @@
+Each storage by itself is just a set of contiguous logical blocks, with
+allowed number of operations. Nodes, each of which has own start and size,
+are placed into storage by appropriate algorithm, which remaps
+logical sector number into real node's sector. One can create
+own algorithms, since DST has pluggable interface for that.
+Currently mirrored and linear algorithms are supported.
+
+Let's briefly describe how they work.
+
+Linear algorithm.
+Simple approach of concatenating storages into single device with
+increased size is used in this algorithm. Essentially new device
+has size equal to sum of sizes of underlying nodes and nodes are
+placed one after another.
+
+  /- Node 1 ---\ /-- Node 3 \
+start  end start   end
+ |==||==|
+ |start end |
+ |  \--- Node 2 -/  |
+ |  |
+start  end
+ \-- DST storage --/
+
+   /\
+   ||
+   ||
+
+  IO operations
+
+   Figure 1. 
+ 3 nodes combined into single storage using linear algorithm.
+
+Mirror algorithm.
+In this algorithm nodes are placed under each other, so when
+operation comes to the first one, it can be mirrored to all
+underlying nodes. In case of reading, actual data is obtained from
+the nearest node - algorithm keeps track of previous operation
+and knows where it was stopped, so that subsequent seek to the
+start of the new request will take the shortest time.
+Writing is always mirrored to all underlying nodes.
+
+  IO operations
+   ||
+   ||
+   \/
+
+| DST storage ---|
+|  prev position |
+|---| Node 1 |
+|  prev pos  |
+| Node 2 -|--|
+|prev pos|
+|---| Node 3 |
+
+   Figure 2.
+   3 nodes combined into single storage using mirror algorithm.
+
+Each algorithm must implement number of callbacks,
+which must be registered during initialization time.
+
+struct dst_alg_ops
+{
+   int (*add_node)(struct dst_node *n);
+   void(*del_node)(struct dst_node *n);
+   int (*remap)(struct dst_request *req);
+   int (*error)(struct kst_state *state, int err);
+   struct module   *owner;
+};
+
+@add_node.
+This callback is invoked when new node is being added into the storage,
+but before node is actually added into the storage, so that it could
+be accessed from it. When it is called, all appropriate initialization
+of the underlying device is already completed (system has been connected
+to remote node or got a reference to the local block device). At this
+stage algorithm can add node into private map. 
+It must return zero on success or negative value otherwise.
+
+@del_node.
+This callback is invoked when node is being deleted from the storage,
+i.e. when its reference counter hits zero. It is called before
+any cleaning is performed.
+It must return zero on success or negative value otherwise.
+
+@remap.
+This callback is invoked each time new bio hits the storage.
+Request structure contains BIO itself, pointer to the node, which originally
+stores the whole region under given IO request, and various parameters
+used by storage core to process this block request.
+It must return zero on success or negative value otherwise. It is up to
+this method to call all cleaning if remapping failed, for example it must
+call kst_bio_endio() for given callback in case of error, which in turn
+will call bio_endio(). Note, that dst_request structure provided in this
+callback is allocated on stack, so if there is a need to use it outside
+of the given function, it must be cloned (it will happen automatically
+in state's push callback, but that copy will not be shared by any other
+user).
+
+@error.
+This callback is invoked for each error, which happened when processed

[3/4] DST: Network state machine.


Network state machine.

Includes network async processing state machine and related tasks.

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]


diff --git a/drivers/block/dst/kst.c b/drivers/block/dst/kst.c
new file mode 100644
index 000..8fa3387
--- /dev/null
+++ b/drivers/block/dst/kst.c
@@ -0,0 +1,1513 @@
+/*
+ * 2007+ Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/socket.h>
+#include <linux/kthread.h>
+#include <linux/net.h>
+#include <linux/in.h>
+#include <linux/poll.h>
+#include <linux/bio.h>
+#include <linux/dst.h>
+
+#include <net/sock.h>
+
+struct kst_poll_helper
+{
+	poll_table		pt;
+	struct kst_state	*st;
+};
+
+static LIST_HEAD(kst_worker_list);
+static DEFINE_MUTEX(kst_worker_mutex);
+
+/*
+ * This function creates bound socket for local export node.
+ */
+static int kst_sock_create(struct kst_state *st, struct saddr *addr,
+		int type, int proto, int backlog)
+{
+	int err;
+
+	err = sock_create(addr->sa_family, type, proto, &st->socket);
+	if (err)
+		goto err_out_exit;
+
+	err = st->socket->ops->bind(st->socket, (struct sockaddr *)addr,
+			addr->sa_data_len);
+	if (err)	/* do not let listen() overwrite a bind() failure */
+		goto err_out_release;
+
+	err = st->socket->ops->listen(st->socket, backlog);
+	if (err)
+		goto err_out_release;
+
+	st->socket->sk->sk_allocation = GFP_NOIO;
+
+	return 0;
+
+err_out_release:
+	sock_release(st->socket);
+err_out_exit:
+	return err;
+}
+
+static void kst_sock_release(struct kst_state *st)
+{
+	if (st->socket) {
+		sock_release(st->socket);
+		st->socket = NULL;
+	}
+}
+
+void kst_wake(struct kst_state *st)
+{
+	if (st) {
+		struct kst_worker *w = st->node->w;
+		unsigned long flags;
+
+		spin_lock_irqsave(&w->ready_lock, flags);
+		if (list_empty(&st->ready_entry))
+			list_add_tail(&st->ready_entry, &w->ready_list);
+		spin_unlock_irqrestore(&w->ready_lock, flags);
+
+		wake_up(&w->wait);
+	}
+}
+EXPORT_SYMBOL_GPL(kst_wake);
+
+/*
+ * Polling machinery.
+ */
+static int kst_state_wake_callback(wait_queue_t *wait, unsigned mode,
+		int sync, void *key)
+{
+	struct kst_state *st = container_of(wait, struct kst_state, wait);
+	kst_wake(st);
+	return 1;
+}
+
+static void kst_queue_func(struct file *file, wait_queue_head_t *whead,
+		poll_table *pt)
+{
+	struct kst_state *st = container_of(pt, struct kst_poll_helper, pt)->st;
+
+	st->whead = whead;
+	init_waitqueue_func_entry(&st->wait, kst_state_wake_callback);
+	add_wait_queue(whead, &st->wait);
+}
+
+static void kst_poll_exit(struct kst_state *st)
+{
+	if (st->whead) {
+		remove_wait_queue(st->whead, &st->wait);
+		st->whead = NULL;
+	}
+}
+
+/*
+ * This function removes request from state tree and ordering list.
+ */
+void kst_del_req(struct dst_request *req)
+{
+	list_del_init(&req->request_list_entry);
+}
+EXPORT_SYMBOL_GPL(kst_del_req);
+
+static struct dst_request *kst_req_first(struct kst_state *st)
+{
+	struct dst_request *req = NULL;
+
+	if (!list_empty(&st->request_list))
+		req = list_entry(st->request_list.next, struct dst_request,
+				request_list_entry);
+	return req;
+}
+
+/*
+ * This function dequeues first request from the queue and tree.
+ */
+static struct dst_request *kst_dequeue_req(struct kst_state *st)
+{
+	struct dst_request *req;
+
+	mutex_lock(&st->request_lock);
+	req = kst_req_first(st);
+	if (req)
+		kst_del_req(req);
+	mutex_unlock(&st->request_lock);
+	return req;
+}
+
+/*
+ * This function enqueues request into tree, indexed by start of the request,
+ * and also puts request into ordered queue.
+ */
+int kst_enqueue_req(struct kst_state *st, struct dst_request *req)
+{
+	if (unlikely(req->flags & DST_REQ_CHECK_QUEUE)) {
+		struct dst_request *r;
+
+		list_for_each_entry(r, &st->request_list, request_list_entry) {
+			if (bio_rw(r->bio) != bio_rw(req->bio))
+				continue;
+
+			if (r->start >= req->start + req->size)
+				continue;
+
+

[4/4] DST: Algorithms used in distributed storage.


Algorithms used in distributed storage.
Mirror and linear mapping code.
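
Both algorithms plug into the core through a table of callbacks that is
registered at initialization time. The userspace sketch below shows that
general pattern only. The registry functions alg_register and alg_find,
the fixed-size table and the demo callback are assumptions for the
example; they are not the dst_alloc_alg()/dst_remove_alg() implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct request {
	unsigned long long start, size;
};

/* Callback table in the spirit of struct dst_alg_ops (simplified). */
struct alg_ops {
	int  (*add_node)(void *node);
	void (*del_node)(void *node);
	int  (*remap)(struct request *req);
	int  (*error)(void *state, int err);
};

struct alg {
	const char *name;
	const struct alg_ops *ops;
};

#define MAX_ALGS 4	/* arbitrary limit chosen for this sketch */
static struct alg alg_table[MAX_ALGS];
static size_t num_algs;

/* Register an algorithm; a remap callback is mandatory in this sketch. */
static struct alg *alg_register(const char *name, const struct alg_ops *ops)
{
	if (!name || !ops || !ops->remap || num_algs == MAX_ALGS)
		return NULL;
	alg_table[num_algs].name = name;
	alg_table[num_algs].ops = ops;
	return &alg_table[num_algs++];
}

/* Look an algorithm up by name, as a storage would when it is created. */
static struct alg *alg_find(const char *name)
{
	assert(num_algs <= MAX_ALGS);
	for (size_t i = 0; i < num_algs; i++)
		if (strcmp(alg_table[i].name, name) == 0)
			return &alg_table[i];
	return NULL;
}

/* Demo callback set: remap does nothing and reports success. */
static int demo_remap(struct request *req)
{
	(void)req;
	return 0;
}

static const struct alg_ops demo_ops = {
	.remap = demo_remap,
};
```

An algorithm module would call alg_register("demo", &demo_ops) from its
init routine and drop the entry again from its exit routine.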

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]


diff --git a/drivers/block/dst/alg_linear.c b/drivers/block/dst/alg_linear.c
new file mode 100644
index 000..9dc0976
--- /dev/null
+++ b/drivers/block/dst/alg_linear.c
@@ -0,0 +1,105 @@
+/*
+ * 2007+ Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/dst.h>
+
+static struct dst_alg *alg_linear;
+
+/*
+ * This callback is invoked when node is removed from storage.
+ */
+static void dst_linear_del_node(struct dst_node *n)
+{
+}
+
+/*
+ * This callback is invoked when node is added to storage.
+ */
+static int dst_linear_add_node(struct dst_node *n)
+{
+	struct dst_storage *st = n->st;
+
+	dprintk("%s: disk_size: %llu, node_size: %llu.\n",
+			__func__, st->disk_size, n->size);
+
+	mutex_lock(&st->tree_lock);
+	n->start = st->disk_size;
+	st->disk_size += n->size;
+	set_capacity(st->disk, st->disk_size);
+	mutex_unlock(&st->tree_lock);
+
+	return 0;
+}
+
+static int dst_linear_remap(struct dst_request *req)
+{
+	int err;
+
+	if (req->node->bdev) {
+		generic_make_request(req->bio);
+		return 0;
+	}
+
+	err = kst_check_permissions(req->state, req->bio);
+	if (err)
+		return err;
+
+	return req->state->ops->push(req);
+}
+
+/*
+ * Failover callback - it is invoked each time error happens during
+ * request processing.
+ */
+static int dst_linear_error(struct kst_state *st, int err)
+{
+	if (err)
+		set_bit(DST_NODE_FROZEN, &st->node->flags);
+	else
+		clear_bit(DST_NODE_FROZEN, &st->node->flags);
+	return 0;
+}
+
+static struct dst_alg_ops alg_linear_ops = {
+   .remap  = dst_linear_remap,
+   .add_node   = dst_linear_add_node,
+   .del_node   = dst_linear_del_node,
+   .error  = dst_linear_error,
+   .owner  = THIS_MODULE,
+};
+
+static int __devinit alg_linear_init(void)
+{
+	alg_linear = dst_alloc_alg("alg_linear", &alg_linear_ops);
+	if (!alg_linear)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static void __devexit alg_linear_exit(void)
+{
+	dst_remove_alg(alg_linear);
+}
+
+module_init(alg_linear_init);
+module_exit(alg_linear_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Evgeniy Polyakov [EMAIL PROTECTED]");
+MODULE_DESCRIPTION("Linear distributed algorithm.");
diff --git a/drivers/block/dst/alg_mirror.c b/drivers/block/dst/alg_mirror.c
new file mode 100644
index 000..3c457ff
--- /dev/null
+++ b/drivers/block/dst/alg_mirror.c
@@ -0,0 +1,1128 @@
+/*
+ * 2007+ Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/poll.h>
+#include <linux/dst.h>
+
+struct dst_mirror_node_data
+{
+	u64			age;
+};
+
+struct dst_mirror_priv
+{
+	unsigned int		chunk_num;
+
+	u64			last_start;
+
+	spinlock_t		backlog_lock;
+	struct list_head	backlog_list;
+
+	struct dst_mirror_node_data	old_data, new_data;
+
+	unsigned long		*chunk;
+};
+
+static struct dst_alg *alg_mirror;
+static struct bio_set *dst_mirror_bio_set;
+
+static int dst_mirror_resync(struct dst_node *n, int ndp);
+
+static void dst_mirror_mark_sync(struct dst_node *n)
+{
+	if (test_bit(DST_NODE_NOTSYNC, &n->flags)) {
+		struct dst_mirror_priv *priv = n->priv;
+
+		clear_bit(DST_NODE_NOTSYNC, &n->flags);
+		dprintk("%s: node: %p, %llu:%llu synchronization "
+				"has been completed.\n",
+			__func__, n, n->start, n->size);
+		priv->old_data.age = 0;
+	}
+}
+

[2/4] DST: Core distributed storage files.


Core distributed storage files.
Include userspace interfaces, initialization,
block layer bindings and other core functionality.

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]


diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig
index b4c8319..ca6592d 100644
--- a/drivers/block/Kconfig
+++ b/drivers/block/Kconfig
@@ -451,6 +451,8 @@ config ATA_OVER_ETH
This driver provides Support for ATA over Ethernet block
devices like the Coraid EtherDrive (R) Storage Blade.
 
+source "drivers/block/dst/Kconfig"
+
 source "drivers/s390/block/Kconfig"
 
 endmenu
diff --git a/drivers/block/Makefile b/drivers/block/Makefile
index dd88e33..fcf042d 100644
--- a/drivers/block/Makefile
+++ b/drivers/block/Makefile
@@ -29,3 +29,4 @@ obj-$(CONFIG_VIODASD) += viodasd.o
 obj-$(CONFIG_BLK_DEV_SX8)  += sx8.o
 obj-$(CONFIG_BLK_DEV_UB)   += ub.o
 
+obj-$(CONFIG_DST)  += dst/
diff --git a/drivers/block/dst/Kconfig b/drivers/block/dst/Kconfig
new file mode 100644
index 000..e91f8ed
--- /dev/null
+++ b/drivers/block/dst/Kconfig
@@ -0,0 +1,28 @@
+config DST
+	tristate "Distributed storage"
+	depends on NET
+	select CONNECTOR
+	select LIBCRC32C
+	---help---
+	This driver allows the creation of a distributed storage.
+
+config DST_DEBUG
+	bool "DST debug"
+	depends on DST
+	---help---
+	This option will turn on HEAVY debugging of the DST.
+	Turn it on ONLY if you have to debug some really obscure problem.
+
+config DST_ALG_LINEAR
+	tristate "Linear distribution algorithm"
+	depends on DST
+	---help---
+	This module allows the creation of a linear mapping of the nodes
+	in the distributed storage.
+
+config DST_ALG_MIRROR
+	tristate "Mirror distribution algorithm"
+	depends on DST
+	---help---
+	This module allows the creation of a mirror of the nodes in the
+	distributed storage.
diff --git a/drivers/block/dst/Makefile b/drivers/block/dst/Makefile
new file mode 100644
index 000..1400e94
--- /dev/null
+++ b/drivers/block/dst/Makefile
@@ -0,0 +1,6 @@
+obj-$(CONFIG_DST) += dst.o
+
+dst-y := dcore.o kst.o
+
+obj-$(CONFIG_DST_ALG_LINEAR) += alg_linear.o
+obj-$(CONFIG_DST_ALG_MIRROR) += alg_mirror.o
diff --git a/drivers/block/dst/dcore.c b/drivers/block/dst/dcore.c
new file mode 100644
index 000..17a5e61
--- /dev/null
+++ b/drivers/block/dst/dcore.c
@@ -0,0 +1,1631 @@
+/*
+ * 2007+ Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/blkdev.h>
+#include <linux/bio.h>
+#include <linux/slab.h>
+#include <linux/connector.h>
+#include <linux/socket.h>
+#include <linux/dst.h>
+#include <linux/device.h>
+#include <linux/in.h>
+#include <linux/in6.h>
+#include <linux/buffer_head.h>
+
+#include <net/sock.h>
+
+static LIST_HEAD(dst_storage_list);
+static LIST_HEAD(dst_alg_list);
+static DEFINE_MUTEX(dst_storage_lock);
+static DEFINE_MUTEX(dst_alg_lock);
+static int dst_major;
+static struct kst_worker *kst_main_worker;
+static struct cb_id cn_dst_id = { CN_DST_IDX, CN_DST_VAL };
+
+struct kmem_cache *dst_request_cache;
+
+static char dst_name[] = "Gamardjoba, genacvale!";
+
+/*
+ * DST sysfs tree. For device called 'storage' which is formed
+ * on top of two nodes this looks like this:
+ *
+ * /sys/devices/storage/
+ * /sys/devices/storage/alg : alg_linear
+ * /sys/devices/storage/n-800/type : R: 192.168.4.80:1025
+ * /sys/devices/storage/n-800/size : 800
+ * /sys/devices/storage/n-800/start : 800
+ * /sys/devices/storage/n-800/clean
+ * /sys/devices/storage/n-800/dirty
+ * /sys/devices/storage/n-0/type : R: 192.168.4.81:1025
+ * /sys/devices/storage/n-0/size : 800
+ * /sys/devices/storage/n-0/start : 0
+ * /sys/devices/storage/n-0/clean
+ * /sys/devices/storage/n-0/dirty
+ * /sys/devices/storage/remove_all_nodes
+ * /sys/devices/storage/nodes : sectors (start [size]): 0 [800] | 800 [800]
+ * /sys/devices/storage/name : storage
+ */
+
+static int dst_dev_match(struct device *dev, struct device_driver *drv)
+{
+   return 1;
+}
+
+static void dst_dev_release(struct device *dev)
+{
+}
+
+static struct bus_type dst_dev_bus_type = {
+	.name		= "dst",
+	.match		= dst_dev_match,
+};
+
+static struct device dst_dev = {
+	.bus		= &dst_dev_bus_type,
+	.release	= dst_dev_release
+};
+
+static void dst_node_release(struct device *dev)
+{
+}
+
+static

Re: [PATCH 5/8] [TFRC]: Loss interval code needs the macros/inlines that were moved

2007-12-10 Thread Arnaldo Carvalho de Melo

On Mon, Dec 10, 2007 at 11:31:53AM +, Gerrit Renker wrote:
> |
> |distcc[24516] ERROR: compile /root/.ccache/packet_his.tmp.aspire.home.net.24512.i on _tiptop failed
> |/usr/src/davem-2.6/net/dccp/ccids/lib/packet_history.c: In function '__one_after_loss':
> |/usr/src/davem-2.6/net/dccp/ccids/lib/packet_history.c:266: error: lvalue required as unary '&' operand
> <snip>
> |
> | Because you do it this way:
> |
> | tfrc_rx_hist_swap(TFRC_RX_HIST_ENTRY(h, 0), TFRC_RX_HIST_ENTRY(h, 3));
> |
> | I checked and at least in this patch series all uses are of this type,
> | so why not do it using just the indexes, which would be simpler:
> |
> | tfrc_rx_hist_swap(h, 0, 3);
> |
> | With this implementation:
> |
> | static void tfrc_rx_hist_swap(struct tfrc_rx_hist *h, const int a, const int b)
> | {
> | 	const int idx_a = tfrc_rx_hist_index(h, a),
> | 		  idx_b = tfrc_rx_hist_index(h, b);
> | 	struct tfrc_rx_hist_entry *tmp = h->ring[idx_a];
> |
> | 	h->ring[idx_a] = h->ring[idx_b];
> | 	h->ring[idx_b] = tmp;
> | }
> |
> Agreed, that is useful in the present case, since then everything uses
> inlines. The only suggestion I'd like to make is to use `u8' instead of
> `int' since the indices will have very low values.

Agreed.
 
> There is a related point: you will probably have noticed that loss_interval.c
> also uses macros. I don't know if you are planning to convert these also into
> inlines. I think that there would be less benefit in converting these, since
> they are local to loss_interval.c and mostly serve to improve readability.

In general I'm against using macros for functions, so please always
consider doing things as inlines.

I'll read some more patches today and provide comments as to if I think
it is ok for now to keep it as macros.

> As I have at least one other patch to revise (plus another minor one),
> I'll rework this according to the above.

Thank you.

- Arnaldo

[PATCH 0/3] UCC TDM driver for MPC83xx platforms

2007-12-10 Thread Aggrwal Poonam

There are three patches:
[PATCH 1/3] drivers/misc : UCC TDM driver for MPC83xx platforms. This
driver is usable in VoIP kinds of applications to interface with SLIC
kinds of devices to exchange TDM voice samples.

[PATCH 2/3] arch/ : Platform changes
- device tree entries for UCC TDM driver for MPC8323ERDB platform.
- QE changes related to TDM:
 1) Modified ucc_fast_init so that it can be used by the fast UCC
based TDM driver. The changes mainly configure the TDM clocks
and Fsyncs.

 2) Modified get_brg_clk so that it can return the input frequency
and input source of any BRG by reading the corresponding entries from
the device tree.

 3) Added new nodes brg and clocks in the device tree which
represent the input clocks for the different BRGs.

 4) Modified qe_setbrg accordingly.
- new device tree entries added for clocks and brg
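
The BRG lookup described in item 2 can be sketched in plain C as follows.
This is only an illustration of the mapping (source 0 selects the QE clock,
1 to 24 select CLK1 to CLK24); the function name, the array parameters and
the QE clock value used in the checks are hypothetical and do not come from
the patch.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_BRGS 16
#define NUM_CLKS 24

/*
 * Resolve the input frequency of BRG number brg (1..16).
 * sources[i] == 0 means the BRG is driven by the QE clock (BRGCLK);
 * 1..24 selects CLK1..CLK24, whose frequency is read from clk_hz[].
 * Returns 0 for an out-of-range BRG or source.
 */
static uint32_t brg_input_hz(unsigned int brg,
			     const uint32_t sources[NUM_BRGS],
			     const uint32_t clk_hz[NUM_CLKS],
			     uint32_t qe_brgclk_hz)
{
	uint32_t src;

	if (brg < 1 || brg > NUM_BRGS)
		return 0;
	src = sources[brg - 1];
	if (src == 0)
		return qe_brgclk_hz;
	if (src > NUM_CLKS)
		return 0;
	return clk_hz[src - 1];
}
```

With the values from the MPC8323ERDB device tree (BRG9 sourced from CLK11 at
1024000 Hz), brg_input_hz(9, ...) yields 1024000, while any BRG whose source
entry is 0 yields whatever QE clock frequency the caller passes in.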

[PATCH 3/3] Documentation
- Modified Documentation to explain the device tree entries related to
UCC TDM driver and the new nodes added(clocks and brg)

The patch applies over a merge of galak's for-2.6.25 plus for-2.6.24
plus of_doc_update branches.
In brief the steps were:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc.git powerpc-galak
git checkout -b for-2.6.25 origin/for-2.6.25
git checkout -b for-2.6.24 origin/for-2.6.24
git checkout -b of_doc_update origin/of_doc_update
git pull . for-2.6.24    # merge the other two
git pull . for-2.6.25
git checkout -b tdm      # clean slate for tdm rebase work

Also, after applying the patches, changes have to be made corresponding to
Tabi's patch "qe: add function qe_clock_source".

The driver has been tested with a VoIP stack and application on
MPC8323ERDB.

With Regards
Poonam 
 
 

[PATCH 1/3] drivers/misc :UCC based TDM driver for MPC83xx platforms.

From: Poonam Aggrwal [EMAIL PROTECTED]

The UCC TDM driver multiplexes and demultiplexes data from different
channels. It can interface with, for example, SLIC kind of devices to
receive TDM data, demultiplex it and send it to upper applications. At the
transmit end it receives data for different channels, multiplexes it and
sends it on the TDM channel. Internally it uses the TSA (Time Slot
Assigner), which does the multiplexing and demultiplexing, a UCC to perform
SDMA between host buffers and the TSA, and the CMX to connect the TSA to
the UCC.
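
To illustrate what multiplexing and demultiplexing mean here, the following
software-only sketch interleaves and de-interleaves samples. It assumes one
8-bit sample per channel per frame, in slot order; in the driver this work is
done by the TSA hardware, and the function names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Demultiplex interleaved TDM frames. Assumed layout: each frame holds one
 * 8-bit sample per channel, channels in slot order. out must have room for
 * num_ch * num_frames bytes and is filled channel by channel.
 */
static void tdm_demux(const uint8_t *in, uint8_t *out,
		      size_t num_ch, size_t num_frames)
{
	for (size_t f = 0; f < num_frames; f++)
		for (size_t ch = 0; ch < num_ch; ch++)
			out[ch * num_frames + f] = in[f * num_ch + ch];
}

/* Inverse operation: interleave per-channel buffers into TDM frames. */
static void tdm_mux(const uint8_t *in, uint8_t *out,
		    size_t num_ch, size_t num_frames)
{
	for (size_t f = 0; f < num_frames; f++)
		for (size_t ch = 0; ch < num_ch; ch++)
			out[f * num_ch + ch] = in[ch * num_frames + f];
}
```

Running tdm_demux followed by tdm_mux on the same buffer sizes returns the
original frame stream, which is a convenient sanity check for any layout
assumption.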

This driver will run on MPC8323E-RDB platforms.


Signed-off-by: Poonam Aggrwal [EMAIL PROTECTED]
Signed-off-by: Ashish Kalra [EMAIL PROTECTED]
Signed-off-by: Kim Phillips [EMAIL PROTECTED]
Signed-off-by: Michael Barkowski [EMAIL PROTECTED]
---
 drivers/misc/Kconfig   |   21 +
 drivers/misc/Makefile  |1 +
 drivers/misc/ucc_tdm.c | 1068 
 drivers/misc/ucc_tdm.h |  227 ++
 4 files changed, 1317 insertions(+), 0 deletions(-)
 create mode 100644 drivers/misc/ucc_tdm.c
 create mode 100644 drivers/misc/ucc_tdm.h

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index b5e67c0..698a72c 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -219,6 +219,27 @@ config THINKPAD_ACPI_BAY
 
  If you are not sure, say Y here.
 
+config UCC_TDM
+   bool "Freescale UCC TDM Driver"
+   depends on QUICC_ENGINE && UCC_FAST
+   default n
+   ---help---
+ The TDM driver is for UCC based TDM devices, for example the TDM device
+ on the MPC832x RDB. Select it to run PowerVoIP on the MPC832x RDB board.
+ The TDM driver can interface with SLIC kind of devices to transmit
+ and receive TDM samples. The TDM driver receives time division
+ multiplexed samples (for different channels) from the SLIC device,
+ demultiplexes them and sends them to the upper layers. At the transmit
+ end the TDM driver receives samples for different channels,
+ multiplexes them and sends them to the SLIC device.
+
+config TDM_LINEAR_PCM
+   bool "Linear PCM mode"
+   depends on UCC_TDM
+   ---help---
+ This mode should be selected if the TDM driver interface with the
+ SLIC device is linear PCM (e.g. 16 bit samples). If not selected the
+ interface will be 8 bit u-law.
 
 config ATMEL_SSC
tristate Device driver for Atmel SSC peripheral
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 87f2685..6f0c49d 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -17,3 +17,4 @@ obj-$(CONFIG_SONY_LAPTOP) += sony-laptop.o
 obj-$(CONFIG_THINKPAD_ACPI)+= thinkpad_acpi.o
 obj-$(CONFIG_FUJITSU_LAPTOP)   += fujitsu-laptop.o
 obj-$(CONFIG_EEPROM_93CX6) += eeprom_93cx6.o
+obj-$(CONFIG_UCC_TDM)  += ucc_tdm.o
diff --git a/drivers/misc/ucc_tdm.c b/drivers/misc/ucc_tdm.c
new file mode 100644
index 000..232d537
--- /dev/null
+++ b/drivers/misc/ucc_tdm.c
@@ -0,0 +1,1068 @@
+/*
+ * drivers/misc/ucc_tdm.c
+ *
+ * UCC Based Linux TDM Driver
+ * This driver is designed to support UCC based TDM for PowerPC processors.
+ * This driver can interface with SLIC device to run VOIP kind of
+ * applications.
+ *
+ * Author: Ashish Kalra & Poonam Aggrwal
+ *
+ * Copyright (c) 2007 Freescale Semiconductor, Inc.
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#include <linux/autoconf.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/errno.h>
+#include <linux/types.h>
+#include <linux/interrupt.h>
+#include <linux/time.h>
+#include <linux/skbuff.h>
+#include <linux/proc_fs.h>
+#include <linux/delay.h>
+#include <linux/dma-mapping.h>
+#include <linux/string.h>
+#include <linux/irq.h>
+#include <linux/of_platform.h>
+#include <linux/io.h>
+#include <linux/wait.h>
+#include <linux/timer.h>
+
+#include <asm/immap_qe.h>
+#include <asm/qe.h>
+#include <asm/ucc.h>
+#include <asm/ucc_fast.h>
+#include <asm/ucc_slow.h>
+
+#include "ucc_tdm.h"
+#define DRV_DESC "Freescale QE UCC TDM Driver"
+#define DRV_NAME "ucc_tdm"
+
+/*
+ * define the following #define if snooping or hardware-based cache coherency
+ * is disabled on the UCC transparent controller. This flag enables
+ * software-based cache-coherency support by explicitly flushing data cache
+ * contents after setting up the TDM output buffer(s) and invalidating the
+ * data cache contents before the TDM input buffer(s) are read.
+ */
+#undef UCC_CACHE_SNOOPING_DISABLED
+
+#define MAX_NUM_TDM_DEVICES 8
+
+static struct tdm_ctrl *tdm_ctrl[MAX_NUM_TDM_DEVICES];
+
+static int num_tdm_devices;
+static int num_tdm_clients;
+
+#define PREV_PHASE(x) ((x == 0) ? MAX_PHASE : (x - 1))
+#define NEXT_PHASE(x) (((x + 1) > MAX_PHASE) ? 0 : (x + 1))
+
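
For comparison, the two phase macros above could be written as inline
functions, which evaluate their argument only once. This is a sketch, not
part of the patch: it assumes the comparison lost from NEXT_PHASE in this
archive copy is `>`, and MAX_PHASE is given a placeholder value here because
its real definition is in ucc_tdm.h, which is not shown.

```c
#include <assert.h>

#define MAX_PHASE 3   /* assumed value; the real constant lives in ucc_tdm.h */

/* Step back one phase, wrapping from 0 to MAX_PHASE. */
static inline unsigned int prev_phase(unsigned int x)
{
	return (x == 0) ? MAX_PHASE : x - 1;
}

/* Step forward one phase, wrapping from MAX_PHASE to 0. */
static inline unsigned int next_phase(unsigned int x)
{
	return (x + 1 > MAX_PHASE) ? 0 : x + 1;
}
```

Unlike the macro form, prev_phase(i++) would increment i exactly once.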

[PATCH 2/3] arch/ : Platform changes for UCC TDM driver for MPC8323ERDB.Also includes related QE changes.

From: Poonam Aggrwal [EMAIL PROTECTED]

This patch makes the necessary changes in the QE and UCC framework to support
TDM. It also adds support to configure the BRG properly through device
tree entries, and includes the device tree entries for the UCC TDM driver.

Tested on MPC8323ERDB platform.

Signed-off-by: Poonam Aggrwal [EMAIL PROTECTED]
Signed-off-by: Ashish Kalra [EMAIL PROTECTED]
Signed-off-by: Kim Phillips [EMAIL PROTECTED]
Signed-off-by: Michael Barkowski [EMAIL PROTECTED]
---
 arch/powerpc/boot/dts/mpc832x_rdb.dts |   58 +++
 arch/powerpc/sysdev/qe_lib/qe.c   |  128 ++--
 arch/powerpc/sysdev/qe_lib/ucc.c  |  265 +
 arch/powerpc/sysdev/qe_lib/ucc_fast.c |   37 +
 include/asm-powerpc/qe.h  |8 +
 include/asm-powerpc/ucc.h |4 +
 include/asm-powerpc/ucc_fast.h|4 +
 7 files changed, 492 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/boot/dts/mpc832x_rdb.dts 
b/arch/powerpc/boot/dts/mpc832x_rdb.dts
index 388c8a7..333408c 100644
--- a/arch/powerpc/boot/dts/mpc832x_rdb.dts
+++ b/arch/powerpc/boot/dts/mpc832x_rdb.dts
@@ -105,6 +105,17 @@
device_type = par_io;
num-ports = 7;
 
+   ucc1pio:[EMAIL PROTECTED] {
+   pio-map = 
+   /* port  pin  dir  open_drain  assignment  has_irq */
+   0  e  2  0  1  0/* CLK11 */
+   3 16  1  0  2  0/* BRG9 */
+   3 1b  1  0  2  0/* BRG3 */
+   0  0  3  0  2  0/* TDMATxD0 */
+   0  4  3  0  2  0/* TDMARxD0 */
+   3 1b  2  0  1  0;  /* CLK1 */
+   };
+
ucc2pio:[EMAIL PROTECTED] {
pio-map = 
/* port  pin  dir  open_drain  assignment  has_irq */
@@ -169,6 +180,36 @@
};
};
 
+   clocks {
+   compatible = "fsl,cpm-clocks";
+   /* clock freqs in Hz(for CLK1~CLK24).
+* CLK11 is 1024KHz,
+* all other clocks unused
+* #clock-cells define number of cells
+* used by the clock-frequency.
+* right now only #clock cells=1 is
+* implemented. Provision is there to
+* handle frequencies > 4Gig
+*/
+   #clock-cells = <1>;
+   clock-frequency = <0 0 0 0 0 0
+  0 0 0 0 d#1024000 0
+  0 0 0 0 0 0
+  0 0 0 0 0 0>;
+   };
+
+   [EMAIL PROTECTED] {
+   compatible = "fsl,cpm-brg";
+   /* input clock sources for all the 16 BRGs.
+* 1-24 for CLK1 to CLK24.
+* BRG9 uses CLK11, BRG1 and BRG2-8 use
+* the QE clock.
+*/
+   fsl,brg-sources = <0 0 0 0 0 0 0 0
+  b 0 0 0 0 0 0 0>;
+   reg = <640 7f>;
+   };
+
[EMAIL PROTECTED] {
device_type = spi;
compatible = fsl_spi;
@@ -187,6 +228,23 @@
mode = cpu;
};
 
+   [EMAIL PROTECTED] {
+   device_type = "tdm";
+   compatible = "fsl,ucc-tdm";
+   model = "UCC";
+   device-id = <1>;
+   fsl,tdm-num = <1>;
+   fsl,si-num = <1>;
+   fsl,tdm-tx-clk = "CLK1";
+   fsl,tdm-rx-clk = "CLK1";
+   fsl,tdm-tx-sync = "BRG9";
+   fsl,tdm-rx-sync = "BRG9";
+   reg = <2000 200>;
+   interrupts = <20>;
+   interrupt-parent = <&qeic>;
+   pio-handle = <&ucc1pio>;
+   };
+
[EMAIL PROTECTED] {
device_type = network;
compatible = ucc_geth;
diff --git a/arch/powerpc/sysdev/qe_lib/qe.c b/arch/powerpc/sysdev/qe_lib/qe.c
index 1df3b4a..abcf0b4 100644
--- a/arch/powerpc/sysdev/qe_lib/qe.c
+++ b/arch/powerpc/sysdev/qe_lib/qe.c
@@ -149,22 +149,116 @@ EXPORT_SYMBOL(qe_issue_cmd);
  */
 static unsigned int brg_clk = 0;
 
-unsigned int get_brg_clk(void)
+u32 get_brg_clk(enum qe_clock brgclk, enum qe_clock *brg_source)
 {
-   struct device_node *qe;
-

[PATCH 3/3] Modified Documentation to explain dts entries for UCC TDM driver.

From: Poonam Aggrwal [EMAIL PROTECTED]

Modified Documentation to explain new properties introduced for UCC TDM 
driver. Also two new nodes have been added brg and clocks to configure 
a BRG from device tree.

Signed-off-by: Poonam Aggrwal [EMAIL PROTECTED]
Signed-off-by: Ashish Kalra [EMAIL PROTECTED]
Signed-off-by: Kim Phillips [EMAIL PROTECTED]
Signed-off-by: Michael Barkowski [EMAIL PROTECTED]
---
 Documentation/powerpc/booting-without-of.txt |   96 +-
 1 files changed, 94 insertions(+), 2 deletions(-)

diff --git a/Documentation/powerpc/booting-without-of.txt 
b/Documentation/powerpc/booting-without-of.txt
index e9a3cb1..94a6b4b 100644
--- a/Documentation/powerpc/booting-without-of.txt
+++ b/Documentation/powerpc/booting-without-of.txt
@@ -1613,8 +1613,8 @@ platforms are moved over to use the flattened-device-tree 
model.
 
Required properties:
- device_type : should be network, hldc, uart, transparent
-bisync or atm.
-   - compatible : could be ucc_geth or fsl_atm and so on.
+bisync, atm or tdm.
+   - compatible : could be ucc_geth, fsl_atm or fsl,ucc-tdm and so on.
- model : should be UCC.
- device-id : the ucc number(1-8), corresponding to UCCx in UM.
- reg : Offset and length of the register set for the device
@@ -1666,7 +1666,44 @@ platforms are moved over to use the 
flattened-device-tree model.
pio-handle = 140001;
};
 
+   Required properties for tdm device_type:
+   - instead of tx-clock and rx-clock following clock properties are
+ required:
+   - fsl,tdm-tx-clk : This property selects the TX clock source for TDM
+   from a bank of clocks.
+   - fsl,tdm-rx-clk : This property selects the RX clock source for TDM
+   from a bank of clocks.
+   - fsl,tdm-tx-sync : This property selects the TX Frame sync source
+   for TDM from a bank of clocks.
+   - fsl,tdm-rx-sync : This property selects the RX Frame sync source
+   for TDM from a bank of clocks.
+
+   All the above mentioned properties are string type with possible
+   values
+   CLK1, CLK2, CLK3...CLK24 and so on
+   BRG1, BRG2, BRG3...BRG16 and so on
+
+  - fsl,tdm-num : TDM to be used (1,2,3 or 4 for TDMA TDMB TDMC TDMD)
+  - fsl,si-num :  Serial Interface to be used.
 
+   Example:
+   [EMAIL PROTECTED] {
+   device_type = "tdm";
+   compatible = "fsl,ucc-tdm";
+   model = "UCC";
+   device-id = <1>;
+   fsl,tdm-num = <1>;
+   fsl,si-num = <1>;
+   fsl,tdm-tx-clk = "CLK1";
+   fsl,tdm-rx-clk = "CLK1";
+   fsl,tdm-tx-sync = "BRG9";
+   fsl,tdm-rx-sync = "BRG9";
+   reg = <2000 200>;
+   interrupts = <20>;
+   interrupt-parent = <&qeic>;
+   pio-handle = <&ucc1pio>;
+   };
+
v) Parallel I/O Ports
 
This node configures Parallel I/O ports for CPUs with QE support.
@@ -1772,6 +1809,61 @@ platforms are moved over to use the 
flattened-device-tree model.
};
};
 
+  viii) Clocks (clocks)
+   This node specifies the frequency values for all the external clocks
+   viz CLK1 to CLK24 in Hz.
+
+  Required Properties:
+  - compatible : should be fsl,cpm-clocks.
+  - #clock-cells : Specifies the number of cells occupied by each entry of
+the clock-frequency property. Currently only #clock-cells = 1 is supported
+and implemented. This property is kept in case frequencies higher than
+4 GHz are needed in future.
+  - clock-frequency : A list of u32 values representing the frequency
+of each external clock (CLK1 to CLK24) in Hz. Each entry occupies the
+number of cells specified by the #clock-cells property (1 for now).
+
+  Example:
+
+   clocks {
+   compatible = "fsl,cpm-clocks";
+   #clock-cells = <1>;
+   /* clock freqs in Hz(for CLK1~CLK24).
+* CLK11 is 1024KHz,
+* all other clocks unused
+*/
+   clock-frequency = <0 0 0 0 0 0
+  0 0 0 0 d#1024000 0
+  0 0 0 0 0 0
+  0 0 0 0 0 0>;
+   };
+
+  ix) Baud Rate Generator (BRG)
+
+  Required properties:
+  - compatible : should be fsl,cpm-brg
+  - fsl,brg-sources : define the input clock for all 16 BRGs. The input
+clock source could be 1 to 24 for CLK1 to CLK24. Zero means that the
+particular BRG will be driven by QE clock(BRGCLK).
+  - reg : This property defines the address and size of the memory-mapped
+registers of the BRG.
+
+  Example:
+
+   [EMAIL PROTECTED] {
+   compatible = "fsl,cpm-brg";
+   /* input clock sources for all the 16 BRGs.
+* 1-24 for CLK1 to CLK24.
+* BRG9 uses CLK11 others use

Re: 2.6.24-rc4-mm1

On Wed, 5 Dec 2007, Andrew Morton wrote:

 On Thu, 06 Dec 2007 17:59:37 +1100 Reuben Farrelly [EMAIL PROTECTED] wrote:
 
  This non fatal oops which I have just noticed may be related to this change 
  then 
  - certainly looks networking related.
 
 yep, but it isn't e1000.  It's core TCP.
 
  WARNING: at net/ipv4/tcp_input.c:2518 tcp_fastretrans_alert()
  Pid: 0, comm: swapper Not tainted 2.6.24-rc4-mm1 #1
 
 Ilpo, Reuben's kernel is talking to you ;)

...Please try the patch below. Andrew, this probably fixes your problem 
(the packets = tp-packets_out) as well.

Dave, please include this one to net-2.6.25.


-- 
 i.

--
[PATCH] [TCP]: Fix fack_count miscountings (multiple places)

1) Fack_count is set incorrectly if the highest sent skb is
already sacked (the skb-prev won't return it because it's on
the other list already). These manifest as fackets_out counting
error later on, the second-order effects are very hard to track,
so it may fix all out-standing TCP bug reports.

2) Prev == NULL check was wrong way around

3) Last skb's fack count was incorrectly skipped while() {} loop

Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED]
---
 include/net/tcp.h |   22 --
 1 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 9dbed0b..11a7e3e 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1337,10 +1337,20 @@ static inline struct sk_buff *tcp_send_head(struct sock *sk)
 static inline void tcp_advance_send_head(struct sock *sk, struct sk_buff *skb)
 {
struct sk_buff *prev = tcp_write_queue_prev(sk, skb);
+   unsigned int fc = 0;
+
+   if (prev == (struct sk_buff *)&sk->sk_write_queue)
+       prev = NULL;
+   else if (!tcp_skb_adjacent(sk, prev, skb))
+       prev = NULL;
 
-   if (prev != (struct sk_buff *)&sk->sk_write_queue)
-       TCP_SKB_CB(skb)->fack_count = TCP_SKB_CB(prev)->fack_count +
-                                     tcp_skb_pcount(prev);
+   if ((prev == NULL) && !__tcp_write_queue_empty(sk, TCP_WQ_SACKED))
+       prev = __tcp_write_queue_tail(sk, TCP_WQ_SACKED);
+
+   if (prev != NULL)
+       fc = TCP_SKB_CB(prev)->fack_count + tcp_skb_pcount(prev);
+
+   TCP_SKB_CB(skb)->fack_count = fc;
 
    sk->sk_send_head = tcp_write_queue_next(sk, skb);
    if (sk->sk_send_head == (struct sk_buff *)&sk->sk_write_queue)
@@ -1464,7 +1474,7 @@ static inline struct sk_buff *__tcp_reset_fack_counts(struct sock *sk,
 {
    unsigned int fc = 0;
 
-   if (prev == NULL)
+   if (prev != NULL)
        fc = TCP_SKB_CB(*prev)->fack_count + tcp_skb_pcount(*prev);
 
    BUG_ON((*prev != NULL) && !tcp_skb_adjacent(sk, *prev, skb));
@@ -1521,7 +1531,7 @@ static inline void tcp_reset_fack_counts(struct sock *sk, struct sk_buff *inskb)
        skb[otherq] = prev->next;
    }
 
-   while (skb[queue] != __tcp_write_queue_tail(sk, queue)) {
+   do {
        /* Lazy find for the other queue */
        if (skb[queue] == NULL) {
            skb[queue] = tcp_write_queue_find(sk, TCP_SKB_CB(prev)->seq,
@@ -1535,7 +1545,7 @@ static inline void tcp_reset_fack_counts(struct sock *sk, struct sk_buff *inskb)
break;
 
queue ^= TCP_WQ_SACKED;
-   }
+   } while (skb[queue] != __tcp_write_queue_tail(sk, queue));
 }
 
 static inline void __tcp_insert_write_queue_after(struct sk_buff *skb,
-- 
1.5.0.6
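
Point 3 of the commit message (the last skb being skipped by the while() {}
loop) comes down to the difference between a pre-tested and a post-tested
loop. The sketch below shows that difference on a plain array; it is an
analogy only, with hypothetical function names, and does not model the two
TCP write queues.

```c
#include <assert.h>
#include <stddef.h>

/* Pre-tested loop: stops before visiting the tail element. */
static size_t visit_while(const int *items, size_t n, int *sum)
{
	size_t i = 0, visited = 0;

	assert(n >= 1);
	*sum = 0;
	while (i != n - 1) {
		*sum += items[i++];
		visited++;
	}
	return visited;
}

/* Post-tested loop: the tail element is processed before the exit check. */
static size_t visit_do_while(const int *items, size_t n, int *sum)
{
	size_t i = 0, visited = 0;

	assert(n >= 1);
	*sum = 0;
	do {
		*sum += items[i];
		visited++;
	} while (i++ != n - 1);
	return visited;
}
```

For a three-element array the first variant visits two elements and the
second visits all three, so any per-element accounting done in the body
misses the tail in the pre-tested form.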

[PATCH] [TCP]: Bind fackets_out state to highest_sack more tightly


Added checks will catch most of the errors if the current
complex fack_count counting logic is flawed somewhere.

Fackets_out should always be advancable if highest_sack is too
because the fackets_out is nowadays accurate (and obviously it
must be smaller than packets_out).

Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED]
---
 net/ipv4/tcp_input.c |   14 +-
 1 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 9499a12..23b2a34 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1270,24 +1270,28 @@ static int tcp_sacktag_one(struct sk_buff *skb, struct sock *sk,
        }
    }
 
-   if (!before(TCP_SKB_CB(skb)->seq, tcp_highest_sack_seq(tp)))
+   fack_count += tcp_skb_pcount(skb);
+   if (!before(TCP_SKB_CB(skb)->seq, tcp_highest_sack_seq(tp))) {
+       WARN_ON((fack_count <= tp->fackets_out) ||
+               (fack_count > tp->packets_out));
+
        tcp_advance_highest_sack(sk, skb);
+       tp->fackets_out = fack_count;
+   } else
+       WARN_ON(fack_count > tp->fackets_out);
+
    tcp_write_queue_requeue(skb, sk, TCP_WQ_SACKED);
 
    TCP_SKB_CB(skb)->sacked |= TCPCB_SACKED_ACKED;
    flag |= FLAG_DATA_SACKED;
    tp->sacked_out += tcp_skb_pcount(skb);
 
-   fack_count += tcp_skb_pcount(skb);
-
    /* Lost marker hint past SACKed? Tweak RFC3517 cnt */
    if (!tcp_is_fack(tp) && (tp->lost_skb_hint != NULL) &&
        before(TCP_SKB_CB(skb)->seq,
               TCP_SKB_CB(tp->lost_skb_hint)->seq))
        tp->lost_cnt_hint += tcp_skb_pcount(skb);
 
-   if (fack_count > tp->fackets_out)
-       tp->fackets_out = fack_count;
    }
}
 
/* D-SACK. We can detect redundant retransmission in S|R and plain R
-- 
1.5.0.6
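
The consistency checks this patch adds can be expressed as a small predicate.
The sketch below is a user-space illustration only: the struct and function
names are hypothetical, and the comparison directions are an assumption,
since the operators were lost in this archive copy and are inferred from the
commit message (fackets_out must advance together with highest_sack and must
not exceed packets_out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct fack_state {            /* hypothetical subset of the TCP socket state */
	uint32_t fackets_out;
	uint32_t packets_out;
};

/*
 * Account for a newly SACKed segment. fack_count is the running packet count
 * up to and including the segment; advances_highest says whether the segment
 * lies at or beyond the current highest SACKed sequence.
 * Returns false when one of the sanity conditions is violated.
 */
static bool fack_update(struct fack_state *tp, uint32_t fack_count,
			bool advances_highest)
{
	bool ok;

	if (advances_highest) {
		ok = fack_count > tp->fackets_out &&
		     fack_count <= tp->packets_out;
		tp->fackets_out = fack_count;
	} else {
		ok = fack_count <= tp->fackets_out;
	}
	return ok;
}
```

A false return corresponds to a case where the kernel code would emit a
warning; the state is still updated, mirroring the fact that WARN_ON() does
not abort processing.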

Re: [PATCH] [TCP]: Bind fackets_out state to highest_sack more tightly

From: Ilpo_Järvinen [EMAIL PROTECTED]
Date: Mon, 10 Dec 2007 14:27:24 +0200 (EET)

 Added checks will catch most of the errors if the current
 complex fack_count counting logic is flawed somewhere.

 Fackets_out should always be advancable if highest_sack is too
 because the fackets_out is nowadays accurate (and obviously it
 must be smaller than packets_out).

 Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED]

Applied to net-2.6.25, thanks!

Re: [PATCH] [TCP]: Bind fackets_out state to highest_sack more tightly

On Mon, 10 Dec 2007, David Miller wrote:

 From: Ilpo_Järvinen [EMAIL PROTECTED]
 Date: Mon, 10 Dec 2007 14:27:24 +0200 (EET)

  Added checks will catch most of the errors if the current
  complex fack_count counting logic is flawed somewhere.

  Fackets_out should always be advancable if highest_sack is too
  because the fackets_out is nowadays accurate (and obviously it
  must be smaller than packets_out).

  Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED]

 Applied to net-2.6.25, thanks!

Please get the fack_count fix as well from the mm1 thread before my 
mailbox gets filled with stacktraces :-) :
  http://marc.info/?l=linux-netdev&m=119728952018975&w=2

-- 
 i.

Re: [1/4] DST: Distributed storage documentation.

On Dec 10, 2007 12:47 PM, Evgeniy Polyakov [EMAIL PROTECTED] wrote:
 diff --git a/Documentation/dst/sysfs.txt b/Documentation/dst/sysfs.txt
 new file mode 100644
 index 000..79d79dc
 --- /dev/null
 +++ b/Documentation/dst/sysfs.txt
 @@ -0,0 +1,30 @@
 +This file describes sysfs files created for each storage.
 +
 +1. Per-storage files.
 +Each storage has its own dir /sysfs/devices/$storage_name,

It's always /sys/devices/.

 +which contains following files:
 +
 +alg - contains name of the algorithm used to created given storage
 +name - name of the storage
 +nodes - map of the storage (list of nodes and their sizes and starts)
 +remove_all_nodes - writable file which allows to remove all nodes from given
 +   storage
 +n-$start-$cookie - per node directory, where
 +   $start - start of the given node in sectors,
 +   $cookie - unique node's id used by DST
 +
 +2. Per-node files.
 +Node's files are located in /sysfs/devices/$storage_name/n-$start-$cookie
 +directory, described above.

To which class or bus do the devices you create belong? Care to show a
tree or ls -la of the device?

Kay

Re: [1/4] DST: Distributed storage documentation.

On Mon, Dec 10, 2007 at 01:51:43PM +0100, Kay Sievers ([EMAIL PROTECTED]) wrote:
 On Dec 10, 2007 12:47 PM, Evgeniy Polyakov [EMAIL PROTECTED] wrote:
  diff --git a/Documentation/dst/sysfs.txt b/Documentation/dst/sysfs.txt
  new file mode 100644
  index 000..79d79dc
  --- /dev/null
  +++ b/Documentation/dst/sysfs.txt
  @@ -0,0 +1,30 @@
  +This file describes sysfs files created for each storage.
  +
  +1. Per-storage files.
  +Each storage has its own dir /sysfs/devices/$storage_name,
 
 It's always /sys/devices/.

I meant that for each new device, it will be placed into
/sys/devices/its_name, but it can also be accessed via
/sys/bus/dst/devices/

  +which contains following files:
  +
  +alg - contains name of the algorithm used to created given storage
  +name - name of the storage
  +nodes - map of the storage (list of nodes and their sizes and starts)
  +remove_all_nodes - writable file which allows to remove all nodes from 
  given
  +   storage
  +n-$start-$cookie - per node directory, where
  +   $start - start of the given node in sectors,
  +   $cookie - unique node's id used by DST
  +
  +2. Per-node files.
  +Node's files are located in /sysfs/devices/$storage_name/n-$start-$cookie
  +directory, described above.
 
 To which class or bus do the devices you create belong? Care to show a
 tree or ls -la of the device?

It is 'dst' bus.

uganda:~/codes# ls -la /sys/devices/staorge/
total 0
drwxr-xr-x 4 root root0 2007-12-10 11:46 .
drwxr-xr-x 9 root root0 2007-12-10 11:46 ..
-r--r--r-- 1 root root 4096 2007-12-10 11:46 alg
lrwxrwxrwx 1 root root0 2007-12-10 11:46 bus -> ../../bus/dst
drwxr-xr-x 3 root root0 2007-12-10 11:46 n-0-81003e24117
-r--r--r-- 1 root root 4096 2007-12-10 11:46 name
-r--r--r-- 1 root root 4096 2007-12-10 11:46 nodes
drwxr-xr-x 2 root root0 2007-12-10 11:46 power
-rw-r--r-- 1 root root 4096 2007-12-10 11:46 remove_all_nodes
lrwxrwxrwx 1 root root0 2007-12-10 11:46 subsystem -> ../../bus/dst
-rw-r--r-- 1 root root 4096 2007-12-10 11:46 uevent
uganda:~/codes# ls -l /sys/bus/dst/
total 0
drwxr-xr-x 2 root root0 2007-12-10 09:52 devices
drwxr-xr-x 2 root root0 2007-12-10 09:52 drivers
-rw-r--r-- 1 root root 4096 2007-12-10 11:46 drivers_autoprobe
--w--- 1 root root 4096 2007-12-10 11:46 drivers_probe
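
A user-space tool walking this tree would need to split the per-node
directory names (n-$start-$cookie) into their parts. The helper below is a
hypothetical sketch: it assumes $start is decimal and $cookie is hexadecimal,
which matches the n-0-81003e24117 entry in the listing but is not stated in
the documentation.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Split a per-node directory name of the form n-$start-$cookie.
 * Assumptions: $start is decimal (sectors) and $cookie is hexadecimal.
 * Returns 0 on success, -1 if the name does not match.
 */
static int dst_parse_node_dir(const char *name,
			      unsigned long long *start,
			      unsigned long long *cookie)
{
	int consumed = 0;

	if (sscanf(name, "n-%llu-%llx%n", start, cookie, &consumed) != 2)
		return -1;
	return name[consumed] == '\0' ? 0 : -1;
}
```

Other entries in the storage directory, such as power or name, are rejected
because they do not start with the n- prefix.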


 Kay

-- 
Evgeniy Polyakov

[PATCH 0/4] [TFRC]: Revised Loss Intervals Patches (macro-less, new swap function)

This revision updates earlier patches, following discussion,
and adds one additional cleanup patch at the end.


Patch #1: Revision of initialisation patch; fixed calling __exit function 
  from __init function - identified by Arnaldo.

Patch #2: Revision - re-converted tfrc_rx_hist_entry() back to inline,
  following discussion with Arnaldo.

Patch #3: Reworked - loss intervals database.  Individual changes:
  - replaced tfrc_rx_hist_swap() with routine suggested by Arnaldo;
  - replaced all access macros with inlines or in-place(s);
  - replaced LIH_INDEX also with inline instead of macro.
  
Patch #4: Removes redundant debugging output from syslog.
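
The fix mentioned under Patch #1 concerns the error path of module
initialisation: when a later step fails, earlier steps must be undone by a
routine that is safe to call from init code. A user-space analogy, with
malloc() standing in for slab cache creation and all names hypothetical:

```c
#include <assert.h>
#include <stdlib.h>

static void *tx_cache, *rx_cache;     /* stand-ins for the two slab caches */

static int tx_history_init(void)
{
	tx_cache = malloc(64);
	return tx_cache == NULL ? -1 : 0;
}

/* Plain function: usable from the init error path and from exit. */
static void tx_history_exit(void)
{
	free(tx_cache);
	tx_cache = NULL;
}

static int rx_history_init(int simulate_failure)
{
	rx_cache = simulate_failure ? NULL : malloc(64);
	return rx_cache == NULL ? -1 : 0;
}

static void rx_history_exit(void)
{
	free(rx_cache);
	rx_cache = NULL;
}

static int lib_init(int fail_rx)
{
	int rc = tx_history_init();

	if (rc)
		return rc;
	rc = rx_history_init(fail_rx);
	if (rc)
		tx_history_exit();    /* unwind the step that already succeeded */
	return rc;
}

static void lib_exit(void)
{
	rx_history_exit();
	tx_history_exit();
}
```

In the kernel the same shape applies, with the constraint that a cleanup
routine called from an __init function must not itself be marked __exit,
because __exit code may be discarded when the code is built in.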

[PATCH 2/4] [PATCH v2] [TFRC]: Loss interval code needs the macros/inlines that were moved

This moves the inlines (which were previously declared as macros) back into
packet_history.h, since the loss detection code needs to be able to read
entries from the RX history in order to create the relevant loss entries: it
needs at least tfrc_rx_hist_loss_prev() and tfrc_rx_hist_last_rcv(), which
in turn require the definition of the other inlines (macros).

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
---
 net/dccp/ccids/lib/packet_history.c |   35 ---
 net/dccp/ccids/lib/packet_history.h |   35 +++
 2 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/net/dccp/ccids/lib/packet_history.c 
b/net/dccp/ccids/lib/packet_history.c
index 1346045..22114c6 100644
--- a/net/dccp/ccids/lib/packet_history.c
+++ b/net/dccp/ccids/lib/packet_history.c
@@ -151,41 +151,6 @@ void tfrc_rx_packet_history_exit(void)
}
 }
 
-/**
- * tfrc_rx_hist_index - index to reach n-th entry after loss_start
- */
-static inline u8 tfrc_rx_hist_index(const struct tfrc_rx_hist *h, const u8 n)
-{
-   return (h->loss_start + n) & TFRC_NDUPACK;
-}
-
-/**
- * tfrc_rx_hist_last_rcv - entry with highest-received-seqno so far
- */
-static inline struct tfrc_rx_hist_entry *
-   tfrc_rx_hist_last_rcv(const struct tfrc_rx_hist *h)
-{
-   return h->ring[tfrc_rx_hist_index(h, h->loss_count)];
-}
-
-/**
- * tfrc_rx_hist_entry - return the n-th history entry after loss_start
- */
-static inline struct tfrc_rx_hist_entry *
-   tfrc_rx_hist_entry(const struct tfrc_rx_hist *h, const u8 n)
-{
-   return h->ring[tfrc_rx_hist_index(h, n)];
-}
-
-/**
- * tfrc_rx_hist_loss_prev - entry with highest-received-seqno before loss was detected
- */
-static inline struct tfrc_rx_hist_entry *
-   tfrc_rx_hist_loss_prev(const struct tfrc_rx_hist *h)
-{
-   return h->ring[h->loss_start];
-}
-
 /* has the packet contained in skb been seen before? */
 int tfrc_rx_hist_duplicate(struct tfrc_rx_hist *h, struct sk_buff *skb)
 {
diff --git a/net/dccp/ccids/lib/packet_history.h 
b/net/dccp/ccids/lib/packet_history.h
index 3dfd182..e58b0fc 100644
--- a/net/dccp/ccids/lib/packet_history.h
+++ b/net/dccp/ccids/lib/packet_history.h
@@ -84,6 +84,41 @@ struct tfrc_rx_hist {
 #define rtt_sample_prev  loss_start
 };
 
+/**
+ * tfrc_rx_hist_index - index to reach n-th entry after loss_start
+ */
+static inline u8 tfrc_rx_hist_index(const struct tfrc_rx_hist *h, const u8 n)
+{
+   return (h->loss_start + n) & TFRC_NDUPACK;
+}
+
+/**
+ * tfrc_rx_hist_last_rcv - entry with highest-received-seqno so far
+ */
+static inline struct tfrc_rx_hist_entry *
+   tfrc_rx_hist_last_rcv(const struct tfrc_rx_hist *h)
+{
+   return h->ring[tfrc_rx_hist_index(h, h->loss_count)];
+}
+
+/**
+ * tfrc_rx_hist_entry - return the n-th history entry after loss_start
+ */
+static inline struct tfrc_rx_hist_entry *
+   tfrc_rx_hist_entry(const struct tfrc_rx_hist *h, const u8 n)
+{
+   return h->ring[tfrc_rx_hist_index(h, n)];
+}
+
+/**
+ * tfrc_rx_hist_loss_prev - entry with highest-received-seqno before loss was detected
+ */
+static inline struct tfrc_rx_hist_entry *
+   tfrc_rx_hist_loss_prev(const struct tfrc_rx_hist *h)
+{
+   return h->ring[h->loss_start];
+}
+
 extern void tfrc_rx_hist_add_packet(struct tfrc_rx_hist *h,
const struct sk_buff *skb, const u32 ndp);
 
-- 
1.5.3.GIT
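
How the accessors moved by this patch select ring entries can be seen in a
reduced user-space model. The sketch assumes a mask value of 3 (a ring of
four entries) and stores sequence numbers directly instead of entry
pointers; the names are hypothetical stand-ins for the tfrc_rx_hist helpers.

```c
#include <assert.h>
#include <stdint.h>

#define NDUPACK 3   /* assumed mask: ring of NDUPACK + 1 = 4 entries */

struct hist {                          /* hypothetical reduced history */
	uint8_t loss_start;            /* slot of the entry before the loss */
	uint8_t loss_count;            /* entries recorded after loss_start */
	uint64_t seqno[NDUPACK + 1];
};

/* Slot of the n-th entry after loss_start, wrapping via the mask. */
static inline uint8_t hist_index(const struct hist *h, uint8_t n)
{
	return (h->loss_start + n) & NDUPACK;
}

/* Highest sequence number received before the loss was detected. */
static inline uint64_t hist_loss_prev(const struct hist *h)
{
	return h->seqno[h->loss_start];
}

/* Highest sequence number received so far. */
static inline uint64_t hist_last_rcv(const struct hist *h)
{
	return h->seqno[hist_index(h, h->loss_count)];
}
```

The bitwise AND only acts as a modulo because the ring size is a power of
two; with any other size the index would need an explicit remainder.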


[PATCH 4/4] [CCID3]: Redundant debugging output / documentation

Each time feedback is sent two lines are printed:

ccid3_hc_rx_send_feedback: client ... - entry
ccid3_hc_rx_send_feedback: Interval ...usec, X_recv=..., 1/p=...

The first line is redundant and thus removed.

Further, documentation of ccid3_hc_rx_sock (capitalisation) is made consistent.

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
---
 net/dccp/ccids/ccid3.c |2 --
 net/dccp/ccids/ccid3.h |4 ++--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/dccp/ccids/ccid3.c b/net/dccp/ccids/ccid3.c
index 60fcb31..b92069b 100644
--- a/net/dccp/ccids/ccid3.c
+++ b/net/dccp/ccids/ccid3.c
@@ -685,8 +685,6 @@ static void ccid3_hc_rx_send_feedback(struct sock *sk,
ktime_t now;
s64 delta = 0;
 
-   ccid3_pr_debug("%s(%p) - entry \n", dccp_role(sk), sk);
-
    if (unlikely(hcrx->ccid3hcrx_state == TFRC_RSTATE_TERM))
return;
 
diff --git a/net/dccp/ccids/ccid3.h b/net/dccp/ccids/ccid3.h
index 3c33dc6..6ceeb80 100644
--- a/net/dccp/ccids/ccid3.h
+++ b/net/dccp/ccids/ccid3.h
@@ -135,9 +135,9 @@ enum ccid3_hc_rx_states {
  *
  *  @ccid3hcrx_x_recv  -  Receiver estimate of send rate (RFC 3448 4.3)
  *  @ccid3hcrx_rtt  -  Receiver estimate of rtt (non-standard)
- *  @ccid3hcrx_p  -  current loss event rate (RFC 3448 5.4)
+ *  @ccid3hcrx_p  -  Current loss event rate (RFC 3448 5.4)
  *  @ccid3hcrx_last_counter  -  Tracks window counter (RFC 4342, 8.1)
- *  @ccid3hcrx_state  -  receiver state, one of %ccid3_hc_rx_states
+ *  @ccid3hcrx_state  -  Receiver state, one of %ccid3_hc_rx_states
  *  @ccid3hcrx_bytes_recv  -  Total sum of DCCP payload bytes
  *  @ccid3hcrx_tstamp_last_feedback  -  Time at which last feedback was sent
  *  @ccid3hcrx_tstamp_last_ack  -  Time at which last feedback was sent
-- 
1.5.3.GIT


[PATCH 1/4] [PATCH v2] [TFRC]: Put RX/TX initialisation into tfrc.c

This separates RX/TX initialisation and puts all packet history / loss intervals
initialisation into tfrc.c.
The organisation is uniform: slab declaration -> {rx,tx}_init() -> {rx,tx}_exit()

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
---
 net/dccp/ccids/lib/packet_history.c |   68 --
 net/dccp/ccids/lib/tfrc.c   |   31 
 2 files changed, 55 insertions(+), 44 deletions(-)

diff --git a/net/dccp/ccids/lib/packet_history.c 
b/net/dccp/ccids/lib/packet_history.c
index af44082..727b17d 100644
--- a/net/dccp/ccids/lib/packet_history.c
+++ b/net/dccp/ccids/lib/packet_history.c
@@ -57,6 +57,22 @@ struct tfrc_tx_hist_entry {
  */
 static struct kmem_cache *tfrc_tx_hist_slab;
 
+int __init tfrc_tx_packet_history_init(void)
+{
+   tfrc_tx_hist_slab = kmem_cache_create("tfrc_tx_hist",
+ sizeof(struct tfrc_tx_hist_entry),
+ 0, SLAB_HWCACHE_ALIGN, NULL);
+   return tfrc_tx_hist_slab == NULL ? -ENOBUFS : 0;
+}
+
+void tfrc_tx_packet_history_exit(void)
+{
+   if (tfrc_tx_hist_slab != NULL) {
+   kmem_cache_destroy(tfrc_tx_hist_slab);
+   tfrc_tx_hist_slab = NULL;
+   }
+}
+
 static struct tfrc_tx_hist_entry *
tfrc_tx_hist_find_entry(struct tfrc_tx_hist_entry *head, u64 seqno)
 {
@@ -119,6 +135,22 @@ EXPORT_SYMBOL_GPL(tfrc_tx_hist_rtt);
  */
 static struct kmem_cache *tfrc_rx_hist_slab;
 
+int __init tfrc_rx_packet_history_init(void)
+{
+   tfrc_rx_hist_slab = kmem_cache_create("tfrc_rxh_cache",
+                                         sizeof(struct tfrc_rx_hist_entry),
+                                         0, SLAB_HWCACHE_ALIGN, NULL);
+   return tfrc_rx_hist_slab == NULL ? -ENOBUFS : 0;
+}
+
+void tfrc_rx_packet_history_exit(void)
+{
+   if (tfrc_rx_hist_slab != NULL) {
+       kmem_cache_destroy(tfrc_rx_hist_slab);
+       tfrc_rx_hist_slab = NULL;
+   }
+}
+
 /**
  * tfrc_rx_hist_index - index to reach n-th entry after loss_start
  */
@@ -316,39 +348,3 @@ keep_ref_for_next_time:
return sample;
 }
 EXPORT_SYMBOL_GPL(tfrc_rx_hist_sample_rtt);
-
-__init int packet_history_init(void)
-{
-   tfrc_tx_hist_slab = kmem_cache_create("tfrc_tx_hist",
-                                         sizeof(struct tfrc_tx_hist_entry), 0,
-                                         SLAB_HWCACHE_ALIGN, NULL);
-   if (tfrc_tx_hist_slab == NULL)
-       goto out_err;
-
-   tfrc_rx_hist_slab = kmem_cache_create("tfrc_rx_hist",
-                                         sizeof(struct tfrc_rx_hist_entry), 0,
-                                         SLAB_HWCACHE_ALIGN, NULL);
-   if (tfrc_rx_hist_slab == NULL)
-       goto out_free_tx;
-
-   return 0;
-
-out_free_tx:
-   kmem_cache_destroy(tfrc_tx_hist_slab);
-   tfrc_tx_hist_slab = NULL;
-out_err:
-   return -ENOBUFS;
-}
-
-void packet_history_exit(void)
-{
-   if (tfrc_tx_hist_slab != NULL) {
-   kmem_cache_destroy(tfrc_tx_hist_slab);
-   tfrc_tx_hist_slab = NULL;
-   }
-
-   if (tfrc_rx_hist_slab != NULL) {
-   kmem_cache_destroy(tfrc_rx_hist_slab);
-   tfrc_rx_hist_slab = NULL;
-   }
-}
diff --git a/net/dccp/ccids/lib/tfrc.c b/net/dccp/ccids/lib/tfrc.c
index 3a7a183..20763fa 100644
--- a/net/dccp/ccids/lib/tfrc.c
+++ b/net/dccp/ccids/lib/tfrc.c
@@ -14,27 +14,42 @@ module_param(tfrc_debug, bool, 0444);
 MODULE_PARM_DESC(tfrc_debug, "Enable debug messages");
 #endif
 
+extern int  tfrc_tx_packet_history_init(void);
+extern void tfrc_tx_packet_history_exit(void);
+extern int  tfrc_rx_packet_history_init(void);
+extern void tfrc_rx_packet_history_exit(void);
+
 extern int  dccp_li_init(void);
 extern void dccp_li_exit(void);
-extern int packet_history_init(void);
-extern void packet_history_exit(void);
 
 static int __init tfrc_module_init(void)
 {
int rc = dccp_li_init();
 
-   if (rc == 0) {
-   rc = packet_history_init();
-   if (rc != 0)
-   dccp_li_exit();
-   }
+   if (rc)
+   goto out;
+
+   rc = tfrc_tx_packet_history_init();
+   if (rc)
+   goto out_free_loss_intervals;
 
+   rc = tfrc_rx_packet_history_init();
+   if (rc)
+   goto out_free_tx_history;
+   return 0;
+
+out_free_tx_history:
+   tfrc_tx_packet_history_exit();
+out_free_loss_intervals:
+   dccp_li_exit();
+out:
return rc;
 }
 
 static void __exit tfrc_module_exit(void)
 {
-   packet_history_exit();
+   tfrc_rx_packet_history_exit();
+   tfrc_tx_packet_history_exit();
dccp_li_exit();
 }
 
-- 
1.5.3.GIT


[PATCH 3/4] [PATCH v2] [TFRC]: Ringbuffer to track loss interval history

A ringbuffer-based implementation of loss interval history is easier to
maintain, allocate, and update.

The `swap' routine to keep the RX history sorted is due to and was written
by Arnaldo Carvalho de Melo, simplifying an earlier macro-based variant.

Details:
 * access to the Loss Interval Records via macro wrappers (with safety checks);
 * simplified, on-demand allocation of entries (no extra memory consumption on
   lossless links); cache allocation is local to the module / exported as
   service;
 * provision of RFC-compliant algorithm to re-compute average loss interval;
 * provision of comprehensive, new loss detection algorithm
   - support for all cases of loss, including re-ordered/duplicate packets;
   - waiting for NDUPACK=3 packets to fill the hole;
   - updating loss records when a late-arriving packet fills a hole.
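
For readers who want to try the ring-buffer indexing and the weighted average outside the kernel, the following is a minimal userspace sketch, not the patch itself. It reuses the LIFO index mapping and the weights from the diff below; LIH_SIZE = NINTERVAL + 1, the by-value ring, and all `lh_*` helper names are assumptions made for illustration. The single-interval case is handled explicitly here because the weighted sum has no closed interval to average over in that case.

```c
#include <stdint.h>

#define NINTERVAL 8u                /* matches the eight weights in the patch */
#define LIH_SIZE  (NINTERVAL + 1u)  /* assumption: one extra slot for the open interval */

/* Loss interval weights from RFC 3448, 5.4, scaled by 10 (as in the patch) */
static const uint32_t lh_weights[NINTERVAL] = { 10, 10, 10, 10, 8, 6, 4, 2 };

struct loss_hist {
	uint32_t ring[LIH_SIZE];    /* interval lengths, held by value in this sketch */
	unsigned int counter;       /* intervals recorded so far = next slot to fill */
};

/* LIFO mapping of a running counter onto ring slots */
static unsigned int lih_index(unsigned int ctr)
{
	return LIH_SIZE - 1 - (ctr % LIH_SIZE);
}

static unsigned int lh_length(const struct loss_hist *lh)
{
	return lh->counter < LIH_SIZE ? lh->counter : LIH_SIZE;
}

/* Record a new interval; it becomes I_0 and the oldest slot is overwritten */
static void lh_push(struct loss_hist *lh, uint32_t len)
{
	lh->ring[lih_index(lh->counter)] = len;
	lh->counter++;
}

/* Return I_i, with i = 0 the most recent interval; caller ensures i < length */
static uint32_t lh_get_interval(const struct loss_hist *lh, unsigned int i)
{
	return lh->ring[lih_index(lh->counter - i - 1)];
}

/* Weighted mean: the larger of the sums with and without the open interval I_0 */
static uint32_t lh_calc_i_mean(const struct loss_hist *lh)
{
	uint32_t i_tot0 = 0, i_tot1 = 0, w_tot = 0;
	int i, k = (int)lh_length(lh) - 1;

	if (k < 0)
		return 0;                       /* empty history */
	if (k == 0)
		return lh_get_interval(lh, 0);  /* only the open interval exists */

	for (i = 0; i <= k; i++) {
		uint32_t i_i = lh_get_interval(lh, (unsigned int)i);

		if (i < k) {
			i_tot0 += i_i * lh_weights[i];
			w_tot  += lh_weights[i];
		}
		if (i > 0)
			i_tot1 += i_i * lh_weights[i - 1];
	}
	return (i_tot0 > i_tot1 ? i_tot0 : i_tot1) / w_tot;
}
```

With two intervals of 300 and then 100 packets, the short open interval is ignored and the mean stays at 300, which is the behaviour the max() is there to provide.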

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Ian McDonald [EMAIL PROTECTED]
---
 net/dccp/ccids/lib/loss_interval.c  |  161 +++-
 net/dccp/ccids/lib/loss_interval.h  |   56 ++-
 net/dccp/ccids/lib/packet_history.c |  202 +++
 net/dccp/ccids/lib/packet_history.h |   11 ++-
 net/dccp/ccids/lib/tfrc.h   |3 +
 5 files changed, 423 insertions(+), 10 deletions(-)

diff --git a/net/dccp/ccids/lib/loss_interval.c b/net/dccp/ccids/lib/loss_interval.c
index c0a933a..39980d1 100644
--- a/net/dccp/ccids/lib/loss_interval.c
+++ b/net/dccp/ccids/lib/loss_interval.c
@@ -1,6 +1,7 @@
 /*
  *  net/dccp/ccids/lib/loss_interval.c
  *
+ *  Copyright (c) 2007   The University of Aberdeen, Scotland, UK
  *  Copyright (c) 2005-7 The University of Waikato, Hamilton, New Zealand.
  *  Copyright (c) 2005-7 Ian McDonald [EMAIL PROTECTED]
  *  Copyright (c) 2005 Arnaldo Carvalho de Melo [EMAIL PROTECTED]
@@ -10,12 +11,7 @@
  *  the Free Software Foundation; either version 2 of the License, or
  *  (at your option) any later version.
  */
-
-#include <linux/module.h>
 #include <net/sock.h>
-#include "../../dccp.h"
-#include "loss_interval.h"
-#include "packet_history.h"
 #include "tfrc.h"
 
 #define DCCP_LI_HIST_IVAL_F_LENGTH  8
@@ -27,6 +23,54 @@ struct dccp_li_hist_entry {
u32  dccplih_interval;
 };
 
+static struct kmem_cache  *tfrc_lh_slab  __read_mostly;
+/* Loss Interval weights from [RFC 3448, 5.4], scaled by 10 */
+static const int tfrc_lh_weights[NINTERVAL] = { 10, 10, 10, 10, 8, 6, 4, 2 };
+
+/* implements LIFO semantics on the array */
+static inline u8 LIH_INDEX(const u8 ctr)
+{
+   return (LIH_SIZE - 1 - (ctr % LIH_SIZE));
+}
+
+/* the `counter' index always points at the next entry to be populated */
+static inline struct tfrc_loss_interval *tfrc_lh_peek(struct tfrc_loss_hist *lh)
+{
+   return lh->counter ? lh->ring[LIH_INDEX(lh->counter - 1)] : NULL;
+}
+
+/* given i with 0 <= i <= k, return I_i as per the rfc3448bis notation */
+static inline u32 tfrc_lh_get_interval(struct tfrc_loss_hist *lh, const u8 i)
+{
+   BUG_ON(i >= lh->counter);
+   return lh->ring[LIH_INDEX(lh->counter - i - 1)]->li_length;
+}
+
+/*
+ * On-demand allocation and de-allocation of entries
+ */
+static struct tfrc_loss_interval *tfrc_lh_demand_next(struct tfrc_loss_hist *lh)
+{
+   if (lh->ring[LIH_INDEX(lh->counter)] == NULL)
+       lh->ring[LIH_INDEX(lh->counter)] = kmem_cache_alloc(tfrc_lh_slab,
+                                                           GFP_ATOMIC);
+   return lh->ring[LIH_INDEX(lh->counter)];
+}
+
+void tfrc_lh_cleanup(struct tfrc_loss_hist *lh)
+{
+   if (!tfrc_lh_is_initialised(lh))
+       return;
+
+   for (lh->counter = 0; lh->counter < LIH_SIZE; lh->counter++)
+       if (lh->ring[LIH_INDEX(lh->counter)] != NULL) {
+           kmem_cache_free(tfrc_lh_slab,
+                           lh->ring[LIH_INDEX(lh->counter)]);
+           lh->ring[LIH_INDEX(lh->counter)] = NULL;
+       }
+}
+EXPORT_SYMBOL_GPL(tfrc_lh_cleanup);
+
 static struct kmem_cache *dccp_li_cachep __read_mostly;
 
 static inline struct dccp_li_hist_entry *dccp_li_hist_entry_new(const gfp_t prio)
@@ -98,6 +142,65 @@ u32 dccp_li_hist_calc_i_mean(struct list_head *list)
 
 EXPORT_SYMBOL_GPL(dccp_li_hist_calc_i_mean);
 
+static void tfrc_lh_calc_i_mean(struct tfrc_loss_hist *lh)
+{
+   u32 i_i, i_tot0 = 0, i_tot1 = 0, w_tot = 0;
+   int i, k = tfrc_lh_length(lh) - 1; /* k is as in rfc3448bis, 5.4 */
+
+   for (i=0; i <= k; i++) {
+       i_i = tfrc_lh_get_interval(lh, i);
+
+       if (i < k) {
+           i_tot0 += i_i * tfrc_lh_weights[i];
+           w_tot  += tfrc_lh_weights[i];
+       }
+       if (i > 0)
+           i_tot1 += i_i * tfrc_lh_weights[i-1];
+   }
+
+   BUG_ON(w_tot == 0);
+   lh->i_mean = max(i_tot0, i_tot1) / w_tot;
+}
+
+/**
+ * tfrc_lh_update_i_mean  -  Update the `open' loss interval I_0
+ * For recomputing p: returns

Re: [1/4] DST: Distributed storage documentation.

On Mon, 2007-12-10 at 15:58 +0300, Evgeniy Polyakov wrote:
> On Mon, Dec 10, 2007 at 01:51:43PM +0100, Kay Sievers ([EMAIL PROTECTED]) wrote:
> > On Dec 10, 2007 12:47 PM, Evgeniy Polyakov [EMAIL PROTECTED] wrote:
> > > diff --git a/Documentation/dst/sysfs.txt b/Documentation/dst/sysfs.txt
> > > new file mode 100644
> > > index 000..79d79dc
> > > --- /dev/null
> > > +++ b/Documentation/dst/sysfs.txt
> > > @@ -0,0 +1,30 @@
> > > +This file describes sysfs files created for each storage.
> > > +
> > > +1. Per-storage files.
> > > +Each storage has its own dir /sysfs/devices/$storage_name,
> >
> > It's always /sys/devices/.
>
> I meant that for each new device, it will be placed into
> /sys/devices/its_name, but it can also be accessed via
> /sys/bus/dst/devices/

Still, it looks like a path. :)

Please don't reference any device directly with a /sys/devices/ path.
You have to use the subsystem links to the devices
in /sys/bus/dst/devices/. Devices are free to move around
in /sys/devices, even during runtime. Yours don't do, but anyway, please
remove all mentioning of direct access to /sys/devices/.
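
As a concrete illustration of going through the subsystem link rather than a /sys/devices/ path, here is a minimal userspace sketch. The helper names are hypothetical, and the example values ("dst", "storage", "alg") are simply taken from the listings in this thread; nothing here is part of the DST patch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "/sys/bus/<bus>/devices/<dev>/<attr>" into buf.
 * Returns 0 on success, -1 if the result would not fit.
 */
static int sysfs_bus_attr_path(char *buf, size_t len, const char *bus,
			       const char *dev, const char *attr)
{
	int n = snprintf(buf, len, "/sys/bus/%s/devices/%s/%s", bus, dev, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Read the first line of an attribute file into val, without the newline.
 * Returns 0 on success, -1 if the file cannot be opened or read.
 */
static int sysfs_read_attr(const char *path, char *val, size_t len)
{
	FILE *f = fopen(path, "r");

	if (f == NULL)
		return -1;
	if (fgets(val, (int)len, f) == NULL) {
		fclose(f);
		return -1;
	}
	fclose(f);
	val[strcspn(val, "\n")] = '\0';
	return 0;
}
```

The kernel resolves the symlink under /sys/bus/dst/devices/ itself, so user space never needs to know where the device currently sits below /sys/devices.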

Btw, where is the top-level /sys/devices/storage/ coming from? I don't
see that in the code. We don't accept any new virtual parents here.
Your devices will automatically appear in /sys/devices/virtual/dst/, and
not below your own parent. But that path does not matter anyway, because
you should only access them from the /sys/bus/dst/devices/ directory.

And in general please don't claim generic names like storage in any
namespace for a very specific subsystem like this.

> > > +which contains following files:
> > > +
> > > +alg - contains name of the algorithm used to created given storage
> > > +name - name of the storage
> > > +nodes - map of the storage (list of nodes and their sizes and starts)
> > > +remove_all_nodes - writable file which allows to remove all nodes from given
> > > +   storage
> > > +n-$start-$cookie - per node directory, where
> > > +   $start - start of the given node in sectors,
> > > +   $cookie - unique node's id used by DST
> > > +
> > > +2. Per-node files.
> > > +Node's files are located in /sysfs/devices/$storage_name/n-$start-$cookie
> > > +directory, described above.
> >
> > To which class or bus do the devices you create belong? Care to show a
> > tree or ls -la of the device?
>
> It is 'dst' bus.
>
> uganda:~/codes# ls -la /sys/devices/staorge/
> total 0
> drwxr-xr-x 4 root root    0 2007-12-10 11:46 .
> drwxr-xr-x 9 root root    0 2007-12-10 11:46 ..
> -r--r--r-- 1 root root 4096 2007-12-10 11:46 alg
> lrwxrwxrwx 1 root root    0 2007-12-10 11:46 bus -> ../../bus/dst
> drwxr-xr-x 3 root root    0 2007-12-10 11:46 n-0-81003e24117
> -r--r--r-- 1 root root 4096 2007-12-10 11:46 name
> -r--r--r-- 1 root root 4096 2007-12-10 11:46 nodes
> drwxr-xr-x 2 root root    0 2007-12-10 11:46 power
> -rw-r--r-- 1 root root 4096 2007-12-10 11:46 remove_all_nodes
> lrwxrwxrwx 1 root root    0 2007-12-10 11:46 subsystem -> ../../bus/dst
> -rw-r--r-- 1 root root 4096 2007-12-10 11:46 uevent

Ok, how does:
  ls -l /sys/devices/storage/n-0-81003e24117
look?

> uganda:~/codes# ls -l /sys/bus/dst/
> total 0
> drwxr-xr-x 2 root root    0 2007-12-10 09:52 devices
> drwxr-xr-x 2 root root    0 2007-12-10 09:52 drivers
> -rw-r--r-- 1 root root 4096 2007-12-10 11:46 drivers_autoprobe
> --w------- 1 root root 4096 2007-12-10 11:46 drivers_probe

How does:
  ls -l /sys/bus/dst/devices
look?


Further questions:
Why do you do your own refcounting instead of using kref?
Why don't you use groups for the attributes?
Why don't you use default attributes for the device, where you get all
error handling done by the core.

Kay


Re: [PATCH 0/2] netem: trace enhancement

2007-12-10 Thread Ariane Keller

I finally managed to rewrite the netem trace extension to use rtnetlink 
communication for the data transfer from user space to kernel space.


The kernel patch is available here:
http://www.tcn.hypert.net/tcn_kernel_2_6_23_rtnetlink

and the iproute patch is here:
http://www.tcn.hypert.net/tcn_iproute2_2_6_23_rtnetlink

Whenever new data is needed the kernel module sends a notification to 
the user space process. Thereupon the user space process sends a data 
package to the kernel module.
I had to write a new qdisc_notify function (qdisc_notify_pid) since the 
other was acquiring a lock, which we already hold in this situation.


I hope everything works as expected and I'm looking forward to your 
comments.


Thanks!
Ariane

Re: [1/4] DST: Distributed storage documentation.

On Mon, Dec 10, 2007 at 03:31:48PM +0100, Kay Sievers ([EMAIL PROTECTED]) wrote:
> > I meant that for each new device, it will be placed into
> > /sys/devices/its_name, but it can also be accessed via
> > /sys/bus/dst/devices/
>
> Still, it looks like a path. :)
>
> Please don't reference any device directly with a /sys/devices/ path.
> You have to use the subsystem links to the devices
> in /sys/bus/dst/devices/. Devices are free to move around
> in /sys/devices, even during runtime. Yours don't do, but anyway, please
> remove all mentioning of direct access to /sys/devices/.

Ok, I will update documentation to reference /sys/bus/dst/devices
instead of /sys/devices

> Btw, where is the top-level /sys/devices/storage/ coming from? I don't
> see that in the code. We don't accept any new virtual parents here.

> Your devices will automatically appear in /sys/devices/virtual/dst/, and
> not below your own parent. But that path does not matter anyway, because
> you should only access them from the /sys/bus/dst/devices/ directory.
>
> And in general please don't claim generic names like storage in any
> namespace for a very specific subsystem like this.

It is not a parent - it is an example for device called 'storage', if it
will be called 'testing', then path will be /sys/devices/testing or more
correct /sys/bus/dst/devices/testing :)

> > It is 'dst' bus.
> >
> > uganda:~/codes# ls -la /sys/devices/staorge/
> > total 0
> > drwxr-xr-x 4 root root    0 2007-12-10 11:46 .
> > drwxr-xr-x 9 root root    0 2007-12-10 11:46 ..
> > -r--r--r-- 1 root root 4096 2007-12-10 11:46 alg
> > lrwxrwxrwx 1 root root    0 2007-12-10 11:46 bus -> ../../bus/dst
> > drwxr-xr-x 3 root root    0 2007-12-10 11:46 n-0-81003e24117
> > -r--r--r-- 1 root root 4096 2007-12-10 11:46 name
> > -r--r--r-- 1 root root 4096 2007-12-10 11:46 nodes
> > drwxr-xr-x 2 root root    0 2007-12-10 11:46 power
> > -rw-r--r-- 1 root root 4096 2007-12-10 11:46 remove_all_nodes
> > lrwxrwxrwx 1 root root    0 2007-12-10 11:46 subsystem -> ../../bus/dst
> > -rw-r--r-- 1 root root 4096 2007-12-10 11:46 uevent
>
> Ok, how does:
>   ls -l /sys/devices/storage/n-0-81003e24117
> look?

uganda:~/codes# ls -l /sys/devices/storage/n-0-81003ebc220/
total 0
drwxr-xr-x 2 root root    0 2007-12-10 13:23 power
-r--r--r-- 1 root root 4096 2007-12-10 13:30 size
-r--r--r-- 1 root root 4096 2007-12-10 13:30 start
-r--r--r-- 1 root root 4096 2007-12-10 13:30 type
-rw-r--r-- 1 root root 4096 2007-12-10 13:30 uevent


> > uganda:~/codes# ls -l /sys/bus/dst/
> > total 0
> > drwxr-xr-x 2 root root    0 2007-12-10 09:52 devices
> > drwxr-xr-x 2 root root    0 2007-12-10 09:52 drivers
> > -rw-r--r-- 1 root root 4096 2007-12-10 11:46 drivers_autoprobe
> > --w------- 1 root root 4096 2007-12-10 11:46 drivers_probe
>
> How does:
>   ls -l /sys/bus/dst/devices
> look?

uganda:~/codes# ls -la /sys/bus/dst/devices/
total 0
drwxr-xr-x 2 root root 0 2007-12-10 13:30 .
drwxr-xr-x 4 root root 0 2007-12-10 13:22 ..
lrwxrwxrwx 1 root root 0 2007-12-10 13:30 storage -> ../../../devices/storage


Here 'storage' is just a name for device called 'storage', it can be
anything else.
 
> Further questions:
> Why do you do your own refcounting instead of using kref?

That's because I always used atomic operations as reference counters
and did not try krefs :)
They are the same actually (modulo tricky arches where smp_mb_* are
required), so I can replace them in the next release.

> Why don't you use groups for the attributes?

For 3-4 attributes it is faster to register them in a loop than typing
another structure :)

> Why don't you use default attributes for the device, where you get all
> error handling done by the core.

What are 'default attributes', and for what devices?
All my sysfs files are so trivial that they do not need anything
special, and I do not see which error handling you mean.

-- 
Evgeniy Polyakov

RE: routing policy based on u32 classifier

2007-12-10 Thread Brian S Julin



Marco wrote:

> Hello everybody.
> Kindly, I would like to know if the is any plan to add this feature to a
> future kernel release.
> I know that fwmark is able to do this, but there is the limitation in source
> ip address selection.

Could you explain the limitation?  My iptables manpage seems to suggest
that u32 is pretty general.  Are you just asking if the pom-ng ipt_u32
will be mainlined?


Re: [1/4] DST: Distributed storage documentation.

On Mon, Dec 10, 2007 at 05:50:55PM +0300, Evgeniy Polyakov ([EMAIL PROTECTED]) wrote:
> > Further questions:
> > Why do you do your own refcounting instead of using kref?
>
> That's because I always used atomic operations as reference counters
> and did not try krefs :)
> They are the same actually (modulo tricky arches where smp_mb_* are
> required), so I can replace them in the next release.

Actually not - I have to set the reference counter to something other than 1
or +/- 1, and thus will have to call kref_get() in a loop, which is a
very ugly step. Is there kref_set() or something like that? At least not
in 2.6.22, which I'm using for now.
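
To make the requirement concrete, the following is a minimal userspace sketch of a reference counter that can start at an arbitrary value, written with C11 atomics. It is not the kernel kref API and not the DST code; `struct obj_ref` and the `ref_*` helpers are hypothetical names used only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct obj_ref {
	atomic_int count;
};

/* Start the counter at n, e.g. one reference per node that will use the object */
static void ref_init(struct obj_ref *r, int n)
{
	atomic_init(&r->count, n);
}

static void ref_get(struct obj_ref *r)
{
	atomic_fetch_add(&r->count, 1);
}

/* Drop one reference; returns true when the caller dropped the last one */
static bool ref_put(struct obj_ref *r)
{
	return atomic_fetch_sub(&r->count, 1) == 1;
}
```

Initialising to n in one step is what a loop of n - 1 get calls after an init-to-one would otherwise have to emulate.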

Sigh, I've converted most of the DST already...

-- 
Evgeniy Polyakov

[patch 0/5] ipv6: make af_inet6 subsystems to return an error at init

This patchset continues the work of making the af_inet6 subsystem
initialization functions return an error code, and of handling that
error so that initialization fails safely.

It takes into account:
* flowlabel
* exthdrs
* frag
* udp
* udplite
* tcp
* raw
-- 

[patch 2/5] ipv6: make extended headers to return an error at initialization

This patch factors the separate init functions for rthdr, nodata and
destopt into a single function, ipv6_exthdrs_init.
This function returns an error code, so the af_inet6 module can check
the initialization properly.
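
The unwinding order matters once three registrations share one init function: a failure at step N should undo exactly steps 1..N-1, in reverse. Below is a minimal userspace sketch of that pattern; `add_handler`, `del_handler`, the `HDR_*` ids and the `fail_at` knob are hypothetical stand-ins, not the inet6 API.

```c
#include <stdbool.h>

enum { HDR_ROUTING, HDR_DSTOPTS, HDR_NONE, HDR_MAX };

static bool registered[HDR_MAX];
static int  fail_at = -1;          /* test knob: id whose registration fails */

static int add_handler(int id)
{
	if (id == fail_at)
		return -1;
	registered[id] = true;
	return 0;
}

static void del_handler(int id)
{
	registered[id] = false;
}

/* Labels appear in reverse order of registration, so each falls through
 * to the cleanup of everything registered before it. */
static int exthdrs_init_sketch(void)
{
	int ret;

	ret = add_handler(HDR_ROUTING);
	if (ret)
		goto out;

	ret = add_handler(HDR_DSTOPTS);
	if (ret)
		goto out_routing;

	ret = add_handler(HDR_NONE);
	if (ret)
		goto out_dstopts;

	return 0;

out_dstopts:
	del_handler(HDR_DSTOPTS);
out_routing:
	del_handler(HDR_ROUTING);
out:
	return ret;
}
```

If the last registration fails, both earlier handlers are removed again; if the second fails, only the first is.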

Signed-off-by: Daniel Lezcano [EMAIL PROTECTED]
---
 include/net/transp_v6.h |5 +--
 net/ipv6/af_inet6.c |   10 +--
 net/ipv6/exthdrs.c  |   64 +---
 3 files changed, 48 insertions(+), 31 deletions(-)

Index: net-2.6.25/include/net/transp_v6.h
===
--- net-2.6.25.orig/include/net/transp_v6.h
+++ net-2.6.25/include/net/transp_v6.h
@@ -17,10 +17,9 @@ extern struct proto tcpv6_prot;
 struct flowi;
 
 /* extention headers */
-extern void ipv6_rthdr_init(void);
+extern int ipv6_exthdrs_init(void);
+extern void ipv6_exthdrs_exit(void);
 extern void ipv6_frag_init(void);
-extern void ipv6_nodata_init(void);
-extern void ipv6_destopt_init(void);
 
 /* transport protocols */
 extern void rawv6_init(void);
Index: net-2.6.25/net/ipv6/af_inet6.c
===
--- net-2.6.25.orig/net/ipv6/af_inet6.c
+++ net-2.6.25/net/ipv6/af_inet6.c
@@ -859,10 +859,11 @@ static int __init inet6_init(void)
goto addrconf_fail;
 
/* Init v6 extension headers. */
-   ipv6_rthdr_init();
+   err = ipv6_exthdrs_init();
+   if (err)
+   goto ipv6_exthdrs_fail;
+
ipv6_frag_init();
-   ipv6_nodata_init();
-   ipv6_destopt_init();
 
/* Init v6 transport protocols. */
udpv6_init();
@@ -874,6 +875,8 @@ static int __init inet6_init(void)
 out:
return err;
 
+ipv6_exthdrs_fail:
+   addrconf_cleanup();
 addrconf_fail:
ip6_flowlabel_cleanup();
 ip6_flowlabel_fail:
@@ -932,6 +935,7 @@ static void __exit inet6_exit(void)
/* Cleanup code parts. */
ipv6_packet_cleanup();
 
+   ipv6_exthdrs_exit();
addrconf_cleanup();
ip6_flowlabel_cleanup();
ip6_route_cleanup();
Index: net-2.6.25/net/ipv6/exthdrs.c
===
--- net-2.6.25.orig/net/ipv6/exthdrs.c
+++ net-2.6.25/net/ipv6/exthdrs.c
@@ -308,28 +308,6 @@ static int ipv6_destopt_rcv(struct sk_bu
return -1;
 }
 
-static struct inet6_protocol destopt_protocol = {
-   .handler    =   ipv6_destopt_rcv,
-   .flags      =   INET6_PROTO_NOPOLICY | INET6_PROTO_GSO_EXTHDR,
-};
-
-void __init ipv6_destopt_init(void)
-{
-   if (inet6_add_protocol(&destopt_protocol, IPPROTO_DSTOPTS) < 0)
-       printk(KERN_ERR "ipv6_destopt_init: Could not register protocol\n");
-}
-
-static struct inet6_protocol nodata_protocol = {
-   .handler    =   dst_discard,
-   .flags      =   INET6_PROTO_NOPOLICY,
-};
-
-void __init ipv6_nodata_init(void)
-{
-   if (inet6_add_protocol(&nodata_protocol, IPPROTO_NONE) < 0)
-       printk(KERN_ERR "ipv6_nodata_init: Could not register protocol\n");
-}
-
 /********************************
   Routing header.
  ********************************/
@@ -527,12 +505,48 @@ static struct inet6_protocol rthdr_proto
.flags  =   INET6_PROTO_NOPOLICY | INET6_PROTO_GSO_EXTHDR,
 };
 
-void __init ipv6_rthdr_init(void)
+static struct inet6_protocol destopt_protocol = {
+   .handler    =   ipv6_destopt_rcv,
+   .flags      =   INET6_PROTO_NOPOLICY | INET6_PROTO_GSO_EXTHDR,
+};
+
+static struct inet6_protocol nodata_protocol = {
+   .handler    =   dst_discard,
+   .flags      =   INET6_PROTO_NOPOLICY,
+};
+
+int __init ipv6_exthdrs_init(void)
 {
-   if (inet6_add_protocol(&rthdr_protocol, IPPROTO_ROUTING) < 0)
-       printk(KERN_ERR "ipv6_rthdr_init: Could not register protocol\n");
+   int ret;
+
+   ret = inet6_add_protocol(&rthdr_protocol, IPPROTO_ROUTING);
+   if (ret)
+       goto out;
+
+   ret = inet6_add_protocol(&destopt_protocol, IPPROTO_DSTOPTS);
+   if (ret)
+       goto out_rthdr;
+
+   ret = inet6_add_protocol(&nodata_protocol, IPPROTO_NONE);
+   if (ret)
+       goto out_destopt;
+
+out:
+   return ret;
+out_destopt:
+   inet6_del_protocol(&destopt_protocol, IPPROTO_DSTOPTS);
+out_rthdr:
+   inet6_del_protocol(&rthdr_protocol, IPPROTO_ROUTING);
+   goto out;
 };
 
+void ipv6_exthdrs_exit(void)
+{
+   inet6_del_protocol(&nodata_protocol, IPPROTO_NONE);
+   inet6_del_protocol(&destopt_protocol, IPPROTO_DSTOPTS);
+   inet6_del_protocol(&rthdr_protocol, IPPROTO_ROUTING);
+}
+
 /**********************************
   Hop-by-hop options.
  **********************************/

-- 

[patch 5/5] ipv6: make the protocol initialization to return an error code

This patch makes the protocol initialization functions return an error code,
so the af_inet6 module can check whether the initialization succeeded.

raw6 is converted as well, for consistency with the other protocols, but its
registration stays in the same place.
Because raw6 now has its own init function, the proto and the ops structures
can be moved into net/ipv6/raw.c.

Signed-off-by: Daniel Lezcano [EMAIL PROTECTED]
---
 include/net/ipv6.h   |2 -
 include/net/transp_v6.h  |   12 ---
 net/ipv6/af_inet6.c  |   77 ---
 net/ipv6/ipv6_sockglue.c |3 +
 net/ipv6/raw.c   |   52 +++
 net/ipv6/tcp_ipv6.c  |   38 ++-
 net/ipv6/udp.c   |   26 +--
 net/ipv6/udplite.c   |   25 ---
 8 files changed, 169 insertions(+), 66 deletions(-)

Index: net-2.6.25/include/net/transp_v6.h
===
--- net-2.6.25.orig/include/net/transp_v6.h
+++ net-2.6.25/include/net/transp_v6.h
@@ -23,10 +23,14 @@ extern int  ipv6_frag_init(void);
 extern void ipv6_frag_exit(void);
 
 /* transport protocols */
-extern void rawv6_init(void);
-extern void udpv6_init(void);
-extern void udplitev6_init(void);
-extern void tcpv6_init(void);
+extern int rawv6_init(void);
+extern void rawv6_exit(void);
+extern int udpv6_init(void);
+extern void udpv6_exit(void);
+extern int udplitev6_init(void);
+extern void udplitev6_exit(void);
+extern int tcpv6_init(void);
+extern void tcpv6_exit(void);
 
 extern int udpv6_connect(struct sock *sk,
  struct sockaddr *uaddr,
Index: net-2.6.25/net/ipv6/af_inet6.c
===
--- net-2.6.25.orig/net/ipv6/af_inet6.c
+++ net-2.6.25/net/ipv6/af_inet6.c
@@ -529,42 +529,6 @@ static struct net_proto_family inet6_fam
.owner  = THIS_MODULE,
 };
 
-/* Same as inet6_dgram_ops, sans udp_poll.  */
-static const struct proto_ops inet6_sockraw_ops = {
-   .family= PF_INET6,
-   .owner = THIS_MODULE,
-   .release   = inet6_release,
-   .bind  = inet6_bind,
-   .connect   = inet_dgram_connect,/* ok   */
-   .socketpair= sock_no_socketpair,/* a do nothing */
-   .accept= sock_no_accept,/* a do nothing */
-   .getname   = inet6_getname,
-   .poll  = datagram_poll, /* ok   */
-   .ioctl = inet6_ioctl,   /* must change  */
-   .listen= sock_no_listen,/* ok   */
-   .shutdown  = inet_shutdown, /* ok   */
-   .setsockopt= sock_common_setsockopt,/* ok   */
-   .getsockopt= sock_common_getsockopt,/* ok   */
-   .sendmsg   = inet_sendmsg,  /* ok   */
-   .recvmsg   = sock_common_recvmsg,   /* ok   */
-   .mmap  = sock_no_mmap,
-   .sendpage  = sock_no_sendpage,
-#ifdef CONFIG_COMPAT
-   .compat_setsockopt = compat_sock_common_setsockopt,
-   .compat_getsockopt = compat_sock_common_getsockopt,
-#endif
-};
-
-static struct inet_protosw rawv6_protosw = {
-   .type   = SOCK_RAW,
-   .protocol   = IPPROTO_IP,   /* wild card */
-   .prot   = &rawv6_prot,
-   .ops= &inet6_sockraw_ops,
-   .capability = CAP_NET_RAW,
-   .no_check   = UDP_CSUM_DEFAULT,
-   .flags  = INET_PROTOSW_REUSE,
-};
-
 int inet6_register_protosw(struct inet_protosw *p)
 {
struct list_head *lh;
@@ -771,7 +735,6 @@ static int __init inet6_init(void)
__this_module.can_unload = ipv6_unload;
 #endif
 #endif
-
err = proto_register(&tcpv6_prot, 1);
if (err)
goto out;
@@ -796,14 +759,16 @@ static int __init inet6_init(void)
/* We MUST register RAW sockets before we create the ICMP6,
 * IGMP6, or NDISC control sockets.
 */
-   inet6_register_protosw(&rawv6_protosw);
+   err = rawv6_init();
+   if (err)
+   goto out_unregister_raw_proto;
 
/* Register the family here so that the init calls below will
 * be able to create sockets. (?? is this dangerous ??)
 */
err = sock_register(&inet6_family_ops);
if (err)
-   goto

[patch 1/5] ipv6: make flowlabel to return an error

This patch makes the flowlabel subsystem return an error code and cleans up
the procfs ifdefs.
af_inet6 uses the return code of the flowlabel init to check that the
initialization succeeded.
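
The ifdef cleanup follows a common C idiom: provide real helpers when the feature is configured and empty static inline stubs otherwise, so callers need no conditionals. A minimal userspace sketch follows; `HAVE_PROC_SKETCH`, `fake_proc_create` and the other names are hypothetical and only mimic the shape of the patch.

```c
#include <errno.h>

#ifdef HAVE_PROC_SKETCH
static int proc_entry_created;

/* Stand-in for creating a proc entry; returns nonzero on success */
static int fake_proc_create(void)
{
	proc_entry_created = 1;
	return 1;
}

static void fake_proc_remove(void)
{
	proc_entry_created = 0;
}

static int flowlabel_proc_init_sketch(void)
{
	if (!fake_proc_create())
		return -ENOMEM;
	return 0;
}

static void flowlabel_proc_fini_sketch(void)
{
	fake_proc_remove();
}
#else
/* Feature compiled out: the stubs succeed and do nothing */
static inline int flowlabel_proc_init_sketch(void)
{
	return 0;
}

static inline void flowlabel_proc_fini_sketch(void)
{
}
#endif

/* Callers are free of #ifdef blocks in either configuration */
static int flowlabel_init_sketch(void)
{
	return flowlabel_proc_init_sketch();
}

static void flowlabel_cleanup_sketch(void)
{
	flowlabel_proc_fini_sketch();
}
```

Built without HAVE_PROC_SKETCH, the init path compiles down to returning 0, which is what lets the caller treat the return value uniformly.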

Signed-off-by: Daniel Lezcano [EMAIL PROTECTED]
---
 include/net/ipv6.h   |2 +-
 net/ipv6/af_inet6.c  |5 -
 net/ipv6/ip6_flowlabel.c |   30 +++---
 3 files changed, 28 insertions(+), 9 deletions(-)

Index: net-2.6.25/include/net/ipv6.h
===
--- net-2.6.25.orig/include/net/ipv6.h
+++ net-2.6.25/include/net/ipv6.h
@@ -219,7 +219,7 @@ extern struct ipv6_txoptions*fl6_merge_
                                                   struct ipv6_txoptions * fopt);
 extern void fl6_free_socklist(struct sock *sk);
 extern int ipv6_flowlabel_opt(struct sock *sk, char __user *optval, int optlen);
-extern void ip6_flowlabel_init(void);
+extern int ip6_flowlabel_init(void);
 extern void ip6_flowlabel_cleanup(void);
 
 static inline void fl6_sock_release(struct ip6_flowlabel *fl)
Index: net-2.6.25/net/ipv6/af_inet6.c
===
--- net-2.6.25.orig/net/ipv6/af_inet6.c
+++ net-2.6.25/net/ipv6/af_inet6.c
@@ -851,7 +851,9 @@ static int __init inet6_init(void)
err = ip6_route_init();
if (err)
goto ip6_route_fail;
-   ip6_flowlabel_init();
+   err = ip6_flowlabel_init();
+   if (err)
+   goto ip6_flowlabel_fail;
err = addrconf_init();
if (err)
goto addrconf_fail;
@@ -874,6 +876,7 @@ out:
 
 addrconf_fail:
ip6_flowlabel_cleanup();
+ip6_flowlabel_fail:
ip6_route_cleanup();
 ip6_route_fail:
 #ifdef CONFIG_PROC_FS
Index: net-2.6.25/net/ipv6/ip6_flowlabel.c
===
--- net-2.6.25.orig/net/ipv6/ip6_flowlabel.c
+++ net-2.6.25/net/ipv6/ip6_flowlabel.c
@@ -692,20 +692,36 @@ static const struct file_operations ip6f
.llseek =   seq_lseek,
.release=   seq_release_private,
 };
-#endif
 
+static int ip6_flowlabel_proc_init(struct net *net)
+{
+   if (!proc_net_fops_create(net, "ip6_flowlabel", S_IRUGO, &ip6fl_seq_fops))
+       return -ENOMEM;
+   return 0;
+}
 
-void ip6_flowlabel_init(void)
+static void ip6_flowlabel_proc_fini(struct net *net)
 {
-#ifdef CONFIG_PROC_FS
-   proc_net_fops_create(&init_net, "ip6_flowlabel", S_IRUGO, &ip6fl_seq_fops);
+   proc_net_remove(net, "ip6_flowlabel");
+}
+#else
+static inline int ip6_flowlabel_proc_init(struct net *net)
+{
+   return 0;
+}
+static inline void ip6_flowlabel_proc_fini(struct net *net)
+{
+   return ;
+}
 #endif
+
+int ip6_flowlabel_init(void)
+{
+   return ip6_flowlabel_proc_init(&init_net);
 }
 
 void ip6_flowlabel_cleanup(void)
 {
    del_timer(&ip6_fl_gc_timer);
-#ifdef CONFIG_PROC_FS
-   proc_net_remove(&init_net, "ip6_flowlabel");
-#endif
+   ip6_flowlabel_proc_fini(&init_net);
 }

-- 

[patch 3/5] ipv6: make frag to return an error at initialization

This patch makes ipv6_frag_init return an error code, so the af_inet6
module can handle the error.

Signed-off-by: Daniel Lezcano [EMAIL PROTECTED]
---
 include/net/transp_v6.h |3 ++-
 net/ipv6/af_inet6.c |8 ++--
 net/ipv6/reassembly.c   |   16 +---
 3 files changed, 21 insertions(+), 6 deletions(-)

Index: net-2.6.25/include/net/transp_v6.h
===
--- net-2.6.25.orig/include/net/transp_v6.h
+++ net-2.6.25/include/net/transp_v6.h
@@ -19,7 +19,8 @@ struct flowi;
 /* extention headers */
 extern int ipv6_exthdrs_init(void);
 extern void ipv6_exthdrs_exit(void);
-extern void ipv6_frag_init(void);
+extern int ipv6_frag_init(void);
+extern void ipv6_frag_exit(void);
 
 /* transport protocols */
 extern void rawv6_init(void);
Index: net-2.6.25/net/ipv6/af_inet6.c
===
--- net-2.6.25.orig/net/ipv6/af_inet6.c
+++ net-2.6.25/net/ipv6/af_inet6.c
@@ -863,7 +863,9 @@ static int __init inet6_init(void)
if (err)
goto ipv6_exthdrs_fail;
 
-   ipv6_frag_init();
+   err = ipv6_frag_init();
+   if (err)
+   goto ipv6_frag_fail;
 
/* Init v6 transport protocols. */
udpv6_init();
@@ -875,6 +877,8 @@ static int __init inet6_init(void)
 out:
return err;
 
+ipv6_frag_fail:
+   ipv6_exthdrs_exit();
 ipv6_exthdrs_fail:
addrconf_cleanup();
 addrconf_fail:
@@ -934,7 +938,7 @@ static void __exit inet6_exit(void)
 
/* Cleanup code parts. */
ipv6_packet_cleanup();
-
+   ipv6_frag_exit();
ipv6_exthdrs_exit();
addrconf_cleanup();
ip6_flowlabel_cleanup();
Index: net-2.6.25/net/ipv6/reassembly.c
===
--- net-2.6.25.orig/net/ipv6/reassembly.c
+++ net-2.6.25/net/ipv6/reassembly.c
@@ -632,11 +632,13 @@ static struct inet6_protocol frag_protoc
.flags  =   INET6_PROTO_NOPOLICY,
 };
 
-void __init ipv6_frag_init(void)
+int __init ipv6_frag_init(void)
 {
-   if (inet6_add_protocol(&frag_protocol, IPPROTO_FRAGMENT) < 0)
-       printk(KERN_ERR "ipv6_frag_init: Could not register protocol\n");
+   int ret;
 
+   ret = inet6_add_protocol(&frag_protocol, IPPROTO_FRAGMENT);
+   if (ret)
+       goto out;
    ip6_frags.ctl = &ip6_frags_ctl;
    ip6_frags.hashfn = ip6_hashfn;
    ip6_frags.constructor = ip6_frag_init;
@@ -646,4 +648,12 @@ void __init ipv6_frag_init(void)
    ip6_frags.match = ip6_frag_match;
    ip6_frags.frag_expire = ip6_frag_expire;
    inet_frags_init(&ip6_frags);
+out:
+   return ret;
+}
+
+void ipv6_frag_exit(void)
+{
+   inet_frags_fini(&ip6_frags);
+   inet6_del_protocol(&frag_protocol, IPPROTO_FRAGMENT);
 }

-- 

[patch 4/5] ipv6: make inet6_register_protosw to return an error code

This patch makes inet6_register_protosw return an error code.
The individual protocols can then tell whether registration succeeded
and pass any error back to the initial caller, af_inet6.

Signed-off-by: Daniel Lezcano [EMAIL PROTECTED]
---
 include/net/protocol.h |2 +-
 net/ipv6/af_inet6.c|   11 +++
 2 files changed, 8 insertions(+), 5 deletions(-)

Index: net-2.6.25/include/net/protocol.h
===
--- net-2.6.25.orig/include/net/protocol.h
+++ net-2.6.25/include/net/protocol.h
@@ -102,7 +102,7 @@ extern void inet_unregister_protosw(stru
 #if defined(CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE)
 extern int	inet6_add_protocol(struct inet6_protocol *prot, unsigned char num);
 extern int	inet6_del_protocol(struct inet6_protocol *prot, unsigned char num);
-extern void	inet6_register_protosw(struct inet_protosw *p);
+extern int	inet6_register_protosw(struct inet_protosw *p);
 extern void	inet6_unregister_protosw(struct inet_protosw *p);
 #endif
 
Index: net-2.6.25/net/ipv6/af_inet6.c
===
--- net-2.6.25.orig/net/ipv6/af_inet6.c
+++ net-2.6.25/net/ipv6/af_inet6.c
@@ -565,21 +565,23 @@ static struct inet_protosw rawv6_protosw
 	.flags		= INET_PROTOSW_REUSE,
 };
 
-void
-inet6_register_protosw(struct inet_protosw *p)
+int inet6_register_protosw(struct inet_protosw *p)
 {
 	struct list_head *lh;
 	struct inet_protosw *answer;
-	int protocol = p->protocol;
 	struct list_head *last_perm;
+	int protocol = p->protocol;
+	int ret;
 
 	spin_lock_bh(&inetsw6_lock);
 
+	ret = -EINVAL;
 	if (p->type >= SOCK_MAX)
 		goto out_illegal;
 
 	/* If we are trying to override a permanent protocol, bail. */
 	answer = NULL;
+	ret = -EPERM;
 	last_perm = &inetsw6[p->type];
 	list_for_each(lh, &inetsw6[p->type]) {
 		answer = list_entry(lh, struct inet_protosw, list);
@@ -603,9 +605,10 @@ inet6_register_protosw(struct inet_proto
 	 * system automatically returns to the old behavior.
 	 */
 	list_add_rcu(&p->list, last_perm);
+	ret = 0;
 out:
 	spin_unlock_bh(&inetsw6_lock);
-	return;
+	return ret;
 
 out_permanent:
 	printk(KERN_ERR "Attempt to override permanent protocol %d.\n",
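
The caller-visible contract of this change can be sketched in userspace as
follows: an out-of-range type yields -EINVAL, an attempt to override a
permanent entry yields -EPERM, and success yields 0. The table layout,
TYPE_MAX, SLOTS and the -ENOSPC case are assumptions made only for this
illustration; the kernel function uses RCU lists under a spinlock instead.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TYPE_MAX 4	/* hypothetical stand-in for SOCK_MAX */
#define SLOTS    8	/* hypothetical fixed capacity per type */

struct proto_entry {
	int type;
	int protocol;
	int permanent;	/* non-zero: may not be overridden */
};

static struct proto_entry *table[TYPE_MAX][SLOTS];
static size_t table_len[TYPE_MAX];

/* Returns 0 on success or a negative errno value, like the patched function. */
static int proto_register(struct proto_entry *p)
{
	size_t i;

	assert(p != NULL);
	if (p->type < 0 || p->type >= TYPE_MAX)
		return -EINVAL;

	/* Refuse to override a permanent protocol. */
	for (i = 0; i < table_len[p->type]; i++) {
		const struct proto_entry *e = table[p->type][i];

		if (e->permanent && e->protocol == p->protocol)
			return -EPERM;
	}

	if (table_len[p->type] == SLOTS)
		return -ENOSPC;	/* not part of the patch, sketch only */

	table[p->type][table_len[p->type]++] = p;
	return 0;
}
```

With a return value available, a protocol's init function can propagate the
failure instead of continuing with a half-registered protocol.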


Re: routing policy based on u32 classifier

2007-12-10 Thread Marco Berizzi

Brian S Julin wrote:

Marco wrote:

  Hello everybody.
  Kindly, I would like to know if there is any plan to add this feature
  to a future kernel release.
  I know that fwmark is able to do this, but there is the limitation
  in source ip address selection.

 Could you explain the limitation?

Indeed.
Here is an example:

   hdsl |      | adsl
   line |      | line
        |      |
      +----------+
      |          |
      |  SQUID   |
      |          |
      +----------+
           |
  internal |
    lan    |


The Linux default gateway points to the hdsl
router. I want to redirect the squid http
traffic (squid runs on the same host) to the adsl
line. So I create a routing table (the adsl table)
and put a default route to the adsl router in it.
I mark the http traffic and insert a rule
based on fwmark which selects the 'adsl table'.
So far, so good. But squid will select its
source ip address from the hdsl network,
because that is where the default gateway is.
Therefore the locally generated packets must be
SNATed to the adsl ip.
Is it clear?
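
For comparison, an application can pick its own source address by calling
bind() before connect(). The kernel then uses the bound address instead of
running source address selection, and a source-based rule (ip rule ... from)
can choose the routing table without SNAT. This is only a sketch of the
socket calls; socket_from_source() is a hypothetical helper, and whether
squid can be configured to do this is not covered in this thread.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns a TCP socket bound to src_ip (port chosen by the kernel),
 * or -1 on error. A later connect() on this socket uses src_ip as the
 * source address of the connection.
 */
static int socket_from_source(const char *src_ip)
{
	struct sockaddr_in src;
	int fd;

	assert(src_ip != NULL);

	memset(&src, 0, sizeof(src));
	src.sin_family = AF_INET;
	src.sin_port = 0;	/* let the kernel pick the port */
	if (inet_pton(AF_INET, src_ip, &src.sin_addr) != 1)
		return -1;

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	if (bind(fd, (struct sockaddr *)&src, sizeof(src)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

In the setup described here the caller would pass the adsl address and then
connect() as usual.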



Re: [patch 0/5] ipv6: make af_inet6 subsystems to return an error at init


Daniel Lezcano wrote:
This patchset continues the work of making the different af_inet6
subsystem initialization functions return an error code, and of
handling the error so that initialization fails safely.

It takes into account:
* flowlabel
* exthdrs
* frag
* udp
* udplite
* tcp
* raw


I just noticed that I forgot to put ipv6 in brackets.
Sorry for that :(

Should I resend the patchset ?

Re: routing policy based on u32 classifier

2007-12-10 Thread Marco Berizzi

Brian S Julin wrote:

 Almost clear... why can you not just add src ADSL IP to
 the fwmark route to set the default source address for locally
 originating packets?

IIRC, it doesn't work because netfilter isn't called
in ip source address selection.



[PATCH 2.6.25] netns: struct net content re-work

2007-12-10 Thread Denis V. Lunev

Recently David Miller and Herbert Xu pointed out that struct net is becoming
bloated and unmaintainable. There are two solutions:
- provide a pointer to a network subsystem definition from struct net.
  This costs an additional dereference
- place the sub-system definition into the structure itself. This speeds up
  run-time access at the cost of recompilation time

The second approach looks better to us. Other sub-systems will be converted
to this approach if this is accepted :)

Signed-off-by: Denis V. Lunev [EMAIL PROTECTED]
---
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index b62e31f..f60e1ce 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -8,6 +8,8 @@
 #include <linux/workqueue.h>
 #include <linux/list.h>
 
+#include <net/netns/unix.h>
+
 struct proc_dir_entry;
 struct net_device;
 struct sock;
@@ -46,8 +48,7 @@ struct net {
 	struct hlist_head	packet_sklist;
 
 	/* unix sockets */
-	int			sysctl_unix_max_dgram_qlen;
-	struct ctl_table_header	*unix_ctl;
+	struct netns_unix	unx;
 };
 
 #ifdef CONFIG_NET
diff --git a/include/net/netns/unix.h b/include/net/netns/unix.h
new file mode 100644
index 000..27b4e7f
--- /dev/null
+++ b/include/net/netns/unix.h
@@ -0,0 +1,13 @@
+/*
+ * Unix network namespace
+ */
+#ifndef __NETNS_UNIX_H__
+#define __NETNS_UNIX_H__
+
+struct ctl_table_header;
+struct netns_unix {
+   int sysctl_unix_max_dgram_qlen;
+   struct ctl_table_header *unix_ctl;
+};
+
+#endif /* __NETNS_UNIX_H__ */
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index b8a2189..06f7ec8 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -592,7 +592,7 @@ static struct sock * unix_create1(struct net *net, struct socket *sock)
 				&af_unix_sk_receive_queue_lock_key);
 
 	sk->sk_write_space	= unix_write_space;
-	sk->sk_max_ack_backlog	= net->sysctl_unix_max_dgram_qlen;
+	sk->sk_max_ack_backlog	= net->unx.sysctl_unix_max_dgram_qlen;
 	sk->sk_destruct		= unix_sock_destructor;
 	u = unix_sk(sk);
 	u->dentry = NULL;
@@ -2138,7 +2138,7 @@ static int unix_net_init(struct net *net)
 {
 	int error = -ENOMEM;
 
-	net->sysctl_unix_max_dgram_qlen = 10;
+	net->unx.sysctl_unix_max_dgram_qlen = 10;
 	if (unix_sysctl_register(net))
 		goto out;
 
diff --git a/net/unix/sysctl_net_unix.c b/net/unix/sysctl_net_unix.c
index 553ef6a..fbddfb5 100644
--- a/net/unix/sysctl_net_unix.c
+++ b/net/unix/sysctl_net_unix.c
@@ -18,7 +18,7 @@ static ctl_table unix_table[] = {
 	{
 		.ctl_name	= NET_UNIX_MAX_DGRAM_QLEN,
 		.procname	= "max_dgram_qlen",
-		.data		= &init_net.sysctl_unix_max_dgram_qlen,
+		.data		= &init_net.unx.sysctl_unix_max_dgram_qlen,
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec
@@ -40,9 +40,9 @@ int unix_sysctl_register(struct net *net)
 	if (table == NULL)
 		goto err_alloc;
 
-	table[0].data = &net->sysctl_unix_max_dgram_qlen;
-	net->unix_ctl = register_net_sysctl_table(net, unix_path, table);
-	if (net->unix_ctl == NULL)
+	table[0].data = &net->unx.sysctl_unix_max_dgram_qlen;
+	net->unx.unix_ctl = register_net_sysctl_table(net, unix_path, table);
+	if (net->unx.unix_ctl == NULL)
 		goto err_reg;
 
 	return 0;
@@ -57,8 +57,8 @@ void unix_sysctl_unregister(struct net *net)
 {
 	struct ctl_table *table;
 
-	table = net->unix_ctl->ctl_table_arg;
-	unregister_sysctl_table(net->unix_ctl);
+	table = net->unx.unix_ctl->ctl_table_arg;
+	unregister_sysctl_table(net->unx.unix_ctl);
 	kfree(table);
 }
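
The two layouts weighed in the patch description can be compared side by
side in a small userspace sketch. struct netns_unix is copied from the
patch; struct net_ptr, struct net_embedded and the two accessor functions
are hypothetical names used only to show where the extra dereference comes
from.

```c
#include <assert.h>
#include <stddef.h>

struct ctl_table_header;	/* opaque here, as in the patch */

struct netns_unix {
	int sysctl_unix_max_dgram_qlen;
	struct ctl_table_header *unix_ctl;
};

/* Variant A: struct net holds a pointer to the subsystem block. */
struct net_ptr {
	struct netns_unix *unx;
};

/* Variant B: the subsystem block is embedded, as the patch does. */
struct net_embedded {
	struct netns_unix unx;
};

/* Two loads: first the pointer, then the field behind it. */
static int qlen_via_pointer(const struct net_ptr *net)
{
	assert(net != NULL && net->unx != NULL);
	return net->unx->sysctl_unix_max_dgram_qlen;
}

/* One load at a fixed offset from the struct net pointer. */
static int qlen_embedded(const struct net_embedded *net)
{
	assert(net != NULL);
	return net->unx.sysctl_unix_max_dgram_qlen;
}
```

Variant B pays for this by requiring the full definition of struct
netns_unix wherever struct net is defined, which is the recompilation cost
the description mentions.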
 

[PATCH 2.6.25] UNIX: remove unused declaration of sysctl_unix_max_dgram_qlen

2007-12-10 Thread Denis V. Lunev

Recently David Miller and Herbert Xu pointed out that struct net is becoming
bloated and unmaintainable. There are two solutions:
- provide a pointer to a network subsystem definition from struct net.
  This costs an additional dereference
- place the sub-system definition into the structure itself. This speeds up
  run-time access at the cost of recompilation time

The second approach looks better to us. Other sub-systems will be converted
to this approach if this is accepted :)

Signed-off-by: Denis V. Lunev [EMAIL PROTECTED]
---
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index b62e31f..f60e1ce 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -8,6 +8,8 @@
 #include linux/workqueue.h
 #include linux/list.h
 
+#include net/netns/unix.h
+
 struct proc_dir_entry;
 struct net_device;
 struct sock;
@@ -46,8 +48,7 @@ struct net {
struct hlist_head   packet_sklist;
 
/* unix sockets */
-   int sysctl_unix_max_dgram_qlen;
-   struct ctl_table_header *unix_ctl;
+   struct netns_unix   unx;
 };
 
 #ifdef CONFIG_NET
diff --git a/include/net/netns/unix.h b/include/net/netns/unix.h
new file mode 100644
index 000..27b4e7f
--- /dev/null
+++ b/include/net/netns/unix.h
@@ -0,0 +1,13 @@
+/*
+ * Unix network namespace
+ */
+#ifndef __NETNS_UNIX_H__
+#define __NETNS_UNIX_H__
+
+struct ctl_table_header;
+struct netns_unix {
+   int sysctl_unix_max_dgram_qlen;
+   struct ctl_table_header *unix_ctl;
+};
+
+#endif /* __NETNS_UNIX_H__ */
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index b8a2189..06f7ec8 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -592,7 +592,7 @@ static struct sock * unix_create1(struct net *net, struct 
socket *sock)
af_unix_sk_receive_queue_lock_key);
 
sk-sk_write_space  = unix_write_space;
-   sk-sk_max_ack_backlog  = net-sysctl_unix_max_dgram_qlen;
+   sk-sk_max_ack_backlog  = net-unx.sysctl_unix_max_dgram_qlen;
sk-sk_destruct = unix_sock_destructor;
u = unix_sk(sk);
u-dentry = NULL;
@@ -2138,7 +2138,7 @@ static int unix_net_init(struct net *net)
 {
int error = -ENOMEM;
 
-   net-sysctl_unix_max_dgram_qlen = 10;
+   net-unx.sysctl_unix_max_dgram_qlen = 10;
if (unix_sysctl_register(net))
goto out;
 
diff --git a/net/unix/sysctl_net_unix.c b/net/unix/sysctl_net_unix.c
index 553ef6a..fbddfb5 100644
--- a/net/unix/sysctl_net_unix.c
+++ b/net/unix/sysctl_net_unix.c
@@ -18,7 +18,7 @@ static ctl_table unix_table[] = {
{
.ctl_name   = NET_UNIX_MAX_DGRAM_QLEN,
.procname   = max_dgram_qlen,
-   .data   = init_net.sysctl_unix_max_dgram_qlen,
+   .data   = init_net.unx.sysctl_unix_max_dgram_qlen,
.maxlen = sizeof(int),
.mode   = 0644,
.proc_handler   = proc_dointvec
@@ -40,9 +40,9 @@ int unix_sysctl_register(struct net *net)
if (table == NULL)
goto err_alloc;
 
-   table[0].data = net-sysctl_unix_max_dgram_qlen;
-   net-unix_ctl = register_net_sysctl_table(net, unix_path, table);
-   if (net-unix_ctl == NULL)
+   table[0].data = net-unx.sysctl_unix_max_dgram_qlen;
+   net-unx.unix_ctl = register_net_sysctl_table(net, unix_path, table);
+   if (net-unx.unix_ctl == NULL)
goto err_reg;
 
return 0;
@@ -57,8 +57,8 @@ void unix_sysctl_unregister(struct net *net)
 {
struct ctl_table *table;
 
-   table = net-unix_ctl-ctl_table_arg;
-   unregister_sysctl_table(net-unix_ctl);
+   table = net-unx.unix_ctl-ctl_table_arg;
+   unregister_sysctl_table(net-unx.unix_ctl);
kfree(table);
 }
 

ip neigh show not showing arp cache entries?

2007-12-10 Thread Chris Friesen



I'm seeing some strange behaviour on a 2.6.14 ppc64 system.  If I run 
ip neigh show it prints out nothing, but if I run arp then I see the 
other nodes on the local network.



[EMAIL PROTECTED]:/root ip neigh show
[EMAIL PROTECTED]:/root arp -n
Address          HWtype  HWaddress           Flags Mask  Iface
172.24.132.0     ether   00:01:AF:14:E8:DA   C           bond0
172.24.132.1             (incomplete)                    bond0
172.24.136.0     ether   00:C0:8B:07:B3:7E   C           bond0
172.24.132.4     ether   00:01:AF:14:E8:DA   C           bond0
172.24.132.2     ether   00:01:AF:14:E8:DA   C           bond0



Any ideas what's going on here?

Thanks,

Chris

Re: [PATCH 2.6.25] netns: struct net content re-work


Denis V. Lunev wrote:

Recently David Miller and Herbert Xu pointed out that struct net becomes
overbloated and un-maintainable. There are two solutions:
- provide a pointer to a network subsystem definition from struct net.
  This costs an additional dereferrence
- place sub-system definition into the structure itself. This will speedup
  run-time access at the cost of recompilation time

The second approach looks better for us. 


Yes, we do not need or want a pointer in this structure; it would add more
dereferences in the network code.



Other sub-systems will be converted
to this approach if this will be accepted :)

Signed-off-by: Denis V. Lunev [EMAIL PROTECTED]
---
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index b62e31f..f60e1ce 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -8,6 +8,8 @@
 #include linux/workqueue.h
 #include linux/list.h

+#include net/netns/unix.h
+
 struct proc_dir_entry;
 struct net_device;
 struct sock;
@@ -46,8 +48,7 @@ struct net {
struct hlist_head   packet_sklist;

/* unix sockets */
-   int sysctl_unix_max_dgram_qlen;
-   struct ctl_table_header *unix_ctl;
+   struct netns_unix   unx;


Can you change this from unx to unix ?

If you encapsulate the structure definitions per subsystem, you can drop 
the unix prefix in the variable declaration.


Instead of having:
netns->unix->unix_ctl
you will have:
netns->unix->ctl


 };

 #ifdef CONFIG_NET
diff --git a/include/net/netns/unix.h b/include/net/netns/unix.h
new file mode 100644
index 000..27b4e7f
--- /dev/null
+++ b/include/net/netns/unix.h
@@ -0,0 +1,13 @@
+/*
+ * Unix network namespace
+ */
+#ifndef __NETNS_UNIX_H__
+#define __NETNS_UNIX_H__
+
+struct ctl_table_header;
+struct netns_unix {
+   int sysctl_unix_max_dgram_qlen;
+   struct ctl_table_header *unix_ctl;
+};
+
+#endif /* __NETNS_UNIX_H__ */


If I follow the logic, we will have a file per subsystem. These files 
will be very small, no ?

IMHO, having the structure defined in net_namespace.h is enough.


diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index b8a2189..06f7ec8 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -592,7 +592,7 @@ static struct sock * unix_create1(struct net *net, struct 
socket *sock)
af_unix_sk_receive_queue_lock_key);

sk-sk_write_space   = unix_write_space;
-   sk-sk_max_ack_backlog   = net-sysctl_unix_max_dgram_qlen;
+   sk-sk_max_ack_backlog   = net-unx.sysctl_unix_max_dgram_qlen;
sk-sk_destruct  = unix_sock_destructor;
u = unix_sk(sk);
u-dentry = NULL;
@@ -2138,7 +2138,7 @@ static int unix_net_init(struct net *net)
 {
int error = -ENOMEM;

-   net-sysctl_unix_max_dgram_qlen = 10;
+   net-unx.sysctl_unix_max_dgram_qlen = 10;
if (unix_sysctl_register(net))
goto out;

diff --git a/net/unix/sysctl_net_unix.c b/net/unix/sysctl_net_unix.c
index 553ef6a..fbddfb5 100644
--- a/net/unix/sysctl_net_unix.c
+++ b/net/unix/sysctl_net_unix.c
@@ -18,7 +18,7 @@ static ctl_table unix_table[] = {
{
.ctl_name   = NET_UNIX_MAX_DGRAM_QLEN,
.procname   = max_dgram_qlen,
-   .data   = init_net.sysctl_unix_max_dgram_qlen,
+   .data   = init_net.unx.sysctl_unix_max_dgram_qlen,
.maxlen = sizeof(int),
.mode   = 0644,
.proc_handler   = proc_dointvec
@@ -40,9 +40,9 @@ int unix_sysctl_register(struct net *net)
if (table == NULL)
goto err_alloc;

-   table[0].data = net-sysctl_unix_max_dgram_qlen;
-   net-unix_ctl = register_net_sysctl_table(net, unix_path, table);
-   if (net-unix_ctl == NULL)
+   table[0].data = net-unx.sysctl_unix_max_dgram_qlen;
+   net-unx.unix_ctl = register_net_sysctl_table(net, unix_path, table);
+   if (net-unx.unix_ctl == NULL)
goto err_reg;

return 0;
@@ -57,8 +57,8 @@ void unix_sysctl_unregister(struct net *net)
 {
struct ctl_table *table;

-   table = net-unix_ctl-ctl_table_arg;
-   unregister_sysctl_table(net-unix_ctl);
+   table = net-unx.unix_ctl-ctl_table_arg;
+   unregister_sysctl_table(net-unx.unix_ctl);
kfree(table);
 }


htb and UDP packages bigger than 1500

2007-12-10 Thread Arkadiusz Miskiewicz

Hello,

I noticed that HTB doesn't properly limit traffic if someone sends UDP
packets bigger than 1500 bytes.

Does HTB have some problems/known limits in this area?

There is other traffic in that class, and when I drop UDP packets bigger than
1500 bytes the remaining traffic is limited properly to the correct value.

udp part of tcpdump log:
17:35:24.041364 IP 172.16.4.23.3185 > 80.51.230.58.9267: UDP, length 35
17:35:24.048834 IP 172.16.4.23.46073 > 83.53.197.205.50882: UDP, length 35
17:35:24.071234 IP 172.16.4.23.2904 > 81.38.28.69.26349: UDP, length 35
17:35:24.076924 IP 201.232.209.115.50750 > 172.16.4.23.57590: UDP, length 8251
17:35:24.140895 IP 90.242.120.252.11982 > 172.16.4.23.19111: UDP, length 35
17:35:24.140976 IP 79.186.64.121.47130 > 172.16.4.23.19111: UDP, length 35
17:35:24.141039 IP 24.150.182.77.55984 > 172.16.4.23.41279: UDP, length 29
17:35:24.141120 IP 83.37.212.153.25153 > 172.16.4.23.32824: UDP, length 8251
17:35:24.154874 IP 172.16.4.23.19111 > 90.242.120.252.11982: UDP, length 8251
17:35:24.210940 IP 83.8.18.150.28955 > 172.16.4.23.24825: UDP, length 8251
17:35:24.240382 IP 172.16.4.23.19111 > 79.186.64.121.47130: UDP, length 8251
17:35:24.272529 IP 83.19.224.219.32977 > 172.16.4.23.19111: UDP, length 35
17:35:24.307164 IP 85.219.10.150.18601 > 172.16.4.23.51986: UDP, length 8251
17:35:24.312335 IP 83.26.249.97.10137 > 172.16.4.23.9383: UDP, length 8251
17:35:24.404250 IP 83.19.224.11.1833 > 172.16.4.23.21258: UDP, length 8251
17:35:24.467562 IP 196.206.89.182.58764 > 172.16.4.23.19111: UDP, length 35
17:35:24.560058 IP 172.16.4.23.50417 > 82.5.204.164.1024: UDP, length 25
17:35:24.563842 IP 172.16.4.23.24825 > 83.8.18.150.28955: UDP, length 35
17:35:24.567316 IP 172.16.4.23.59727 > 195.60.65.36.61323: UDP, length 35
17:35:24.569976 IP 83.11.67.228.31949 > 172.16.4.23.56823: UDP, length 8251
17:35:24.617104 IP 172.16.4.23.28945 > 76.11.24.115.13887: UDP, length 29
17:35:24.619235 IP 172.16.4.23.21258 > 83.19.224.11.1833: UDP, length 35
17:35:24.626488 IP 172.16.4.23.9383 > 83.26.249.97.10137: UDP, length 35
17:35:24.640367 IP 172.16.4.23.47366 > 91.146.230.1.38928: UDP, length 25
17:35:24.644314 IP 90.242.120.252.1955 > 172.16.4.23.19111: UDP, length 35
17:35:24.652024 IP 81.184.124.145.22454 > 172.16.4.23.57089: UDP, length 8251
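
For scale, the 8251-byte datagrams in this log cannot cross a 1500-byte MTU
link in one piece; each is carried as several IPv4 fragments. The arithmetic
is sketched here assuming an MTU of 1500 and a 20-byte IPv4 header without
options. Both are assumptions, and the helper names are hypothetical.
Whether fragmentation is what confuses HTB is not established in this
thread.

```c
#include <assert.h>

#define IPV4_HDR 20u	/* assumption: IPv4 header without options */
#define UDP_HDR   8u

/* Number of IPv4 fragments needed for one UDP datagram. */
static unsigned int udp_fragment_count(unsigned int udp_payload,
				       unsigned int mtu)
{
	unsigned int per_frag, ip_payload;

	assert(mtu >= IPV4_HDR + 8u);

	/* Every fragment but the last carries a multiple of 8 bytes. */
	per_frag = ((mtu - IPV4_HDR) / 8u) * 8u;
	ip_payload = udp_payload + UDP_HDR;
	return (ip_payload + per_frag - 1u) / per_frag;
}

/* Total IPv4 bytes on the wire, one IP header per fragment. */
static unsigned int udp_wire_bytes(unsigned int udp_payload,
				   unsigned int mtu)
{
	return udp_payload + UDP_HDR +
	       udp_fragment_count(udp_payload, mtu) * IPV4_HDR;
}
```

Under these assumptions udp_fragment_count(8251, 1500) is 6 and
udp_wire_bytes(8251, 1500) is 8379, while a 35-byte datagram stays a single
63-byte packet.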

-- 
Arkadiusz Miśkiewicz        PLD/Linux Team
arekm / maven.pl            http://ftp.pld-linux.org/

Re: [PATCH 2.6.25] netns: struct net content re-work

2007-12-10 Thread Kirill Korotaev

Daniel Lezcano wrote:
 Denis V. Lunev wrote:
 
Recently David Miller and Herbert Xu pointed out that struct net becomes
overbloated and un-maintainable. There are two solutions:
- provide a pointer to a network subsystem definition from struct net.
  This costs an additional dereferrence
- place sub-system definition into the structure itself. This will speedup
  run-time access at the cost of recompilation time

The second approach looks better for us. 
 
 
 Yes, we do not need/want a pointer in this structure and add more 
 dereference in the network code.
 
 
Other sub-systems will be converted
to this approach if this will be accepted :)

Signed-off-by: Denis V. Lunev [EMAIL PROTECTED]
---
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index b62e31f..f60e1ce 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -8,6 +8,8 @@
 #include linux/workqueue.h
 #include linux/list.h

+#include net/netns/unix.h
+
 struct proc_dir_entry;
 struct net_device;
 struct sock;
@@ -46,8 +48,7 @@ struct net {
  struct hlist_head   packet_sklist;

  /* unix sockets */
- int sysctl_unix_max_dgram_qlen;
- struct ctl_table_header *unix_ctl;
+ struct netns_unix   unx;
 
 
 Can you change this from unx to unix ?

no, it won't compile. Guess why :)

 If you encapsulate the structure definitions per subsystem, you can drop 
 the unix prefix in the variable declaration.
 
 Instead of having:
   netns->unix->unix_ctl
 you will have:
   netns->unix->ctl

agree.

Kirill
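
The "won't compile" remark presumably refers to the fact that GCC, in its
GNU dialect modes on Unix-like targets, predefines unix as a macro expanding
to 1, so a member declared as "struct netns_unix unix;" would be
preprocessed into "struct netns_unix 1;". That reading is an assumption; the
thread does not spell out the reason. A small check that compiles in either
mode:

```c
/*
 * Returns 1 when the compiler predefines `unix` as a macro (GCC in
 * -std=gnu* modes on Unix-like targets), 0 otherwise (for example with
 * -std=c99, where only __unix__ is defined).
 */
static int unix_is_predefined_macro(void)
{
#ifdef unix
	return 1;
#else
	return 0;
#endif
}
```

When it returns 1, any identifier spelled exactly unix is replaced before
the compiler proper sees it, which is why the field is called unx in the
patch.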

Re: ip neigh show not showing arp cache entries?

2007-12-10 Thread Chris Friesen


Chris Friesen wrote:


I'm seeing some strange behaviour on a 2.6.14 ppc64 system.  If I run 
ip neigh show it prints out nothing, but if I run arp then I see the 
other nodes on the local network.



[EMAIL PROTECTED]:/root ip neigh show
[EMAIL PROTECTED]:/root arp -n
Address          HWtype  HWaddress           Flags Mask  Iface
172.24.132.0     ether   00:01:AF:14:E8:DA   C           bond0
172.24.132.1             (incomplete)                    bond0
172.24.136.0     ether   00:C0:8B:07:B3:7E   C           bond0
172.24.132.4     ether   00:01:AF:14:E8:DA   C           bond0
172.24.132.2     ether   00:01:AF:14:E8:DA   C           bond0


Any ideas what's going on here?


I've got some further information.  If I look for a specific address, it 
seems to work:


[EMAIL PROTECTED]:/root ip neigh show 172.24.136.0
172.24.136.0 dev bond0 lladdr 00:c0:8b:07:b3:7e REACHABLE


In the above scenario, the arp cache lists the device as reachable via 
bond0.  If I search the arp cache to see whether the address is 
reachable from one of bond0's slave devices, should it come back 
positive or negative?


Chris

[PATCH] IPv6 support for NFS server

2007-12-10 Thread Aurélien Charbon



Here is a cleanup for the ip_map caching patch in nfs server.

It prepares for IPv6 text-based mounts and exports.

Tests: tested with only IPv4 network and basic nfs ops (mount, file 
creation and modification)


-

Signed-off-by: Aurelien Charbon [EMAIL PROTECTED]

diff -p -u -r -N linux-2.6.24-rc4/fs/nfsd/export.c 
linux-2.6.24-rc4-IPv6-cache-based/fs/nfsd/export.c

--- linux-2.6.24-rc4/fs/nfsd/export.c2007-12-10 16:11:37.0 +0100
+++ linux-2.6.24-rc4-IPv6-cache-based/fs/nfsd/export.c2007-12-10 
17:50:37.0 +0100

@@ -35,6 +35,7 @@
#include linux/lockd/bind.h
#include linux/sunrpc/msg_prot.h
#include linux/sunrpc/gss_api.h
+#include net/ipv6.h

#define NFSDDBG_FACILITYNFSDDBG_EXPORT

@@ -1556,6 +1557,7 @@ exp_addclient(struct nfsctl_client *ncp)
{
struct auth_domain*dom;
inti, err;
+struct in6_addr addr6;

/* First, consistency check. */
err = -EINVAL;
@@ -1574,9 +1576,12 @@ exp_addclient(struct nfsctl_client *ncp)
goto out_unlock;

/* Insert client into hashtable. */
-for (i = 0; i  ncp-cl_naddr; i++)
-auth_unix_add_addr(ncp-cl_addrlist[i], dom);
-
+for (i = 0; i  ncp-cl_naddr; i++) {
+/* Mapping address */
+ipv6_addr_set(addr6, 0, 0,
+htonl(0x), ncp-cl_addrlist[i].s_addr);
+auth_unix_add_addr(addr6, dom);
+}
auth_unix_forget_old(dom);
auth_domain_put(dom);

diff -p -u -r -N linux-2.6.24-rc4/fs/nfsd/nfsctl.c 
linux-2.6.24-rc4-IPv6-cache-based/fs/nfsd/nfsctl.c

--- linux-2.6.24-rc4/fs/nfsd/nfsctl.c2007-12-10 16:11:37.0 +0100
+++ linux-2.6.24-rc4-IPv6-cache-based/fs/nfsd/nfsctl.c2007-12-10 
18:15:22.0 +0100

@@ -37,6 +37,7 @@
#include linux/nfsd/syscall.h

#include asm/uaccess.h
+#include net/ipv6.h

/*
 *We have a single directory with 9 nodes in it.
@@ -222,6 +223,7 @@ static ssize_t write_getfs(struct file *
struct auth_domain *clp;
int err = 0;
struct knfsd_fh *res;
+struct in6_addr in6;

if (size  sizeof(*data))
return -EINVAL;
@@ -236,7 +238,13 @@ static ssize_t write_getfs(struct file *
res = (struct knfsd_fh*)buf;

exp_readlock();
-if (!(clp = auth_unix_lookup(sin-sin_addr)))
+
+/* IPv6 address mapping */
+ipv6_addr_set(in6, 0, 0,
+htonl(0x),
+(((struct sockaddr_in *)data-gd_addr)-sin_addr.s_addr));
+
+if (!(clp = auth_unix_lookup(in6)))
err = -EPERM;
else {
err = exp_rootfh(clp, data-gd_path, res, data-gd_maxlen);
@@ -257,6 +265,7 @@ static ssize_t write_getfd(struct file *
int err = 0;
struct knfsd_fh fh;
char *res;
+struct in6_addr in6;

if (size  sizeof(*data))
return -EINVAL;
@@ -271,7 +280,12 @@ static ssize_t write_getfd(struct file *
res = buf;
sin = (struct sockaddr_in *)data-gd_addr;
exp_readlock();
-if (!(clp = auth_unix_lookup(sin-sin_addr)))
+/* IPv6 address mapping */
+ipv6_addr_set(in6, 0, 0,
+htonl(0x),
+(((struct sockaddr_in *)data-gd_addr)-sin_addr.s_addr));
+
+if (!(clp = auth_unix_lookup(in6)))
err = -EPERM;
else {
err = exp_rootfh(clp, data-gd_path, fh, NFS_FHSIZE);
diff -p -u -r -N linux-2.6.24-rc4/include/linux/sunrpc/svcauth.h 
linux-2.6.24-rc4-IPv6-cache-based/include/linux/sunrpc/svcauth.h
--- linux-2.6.24-rc4/include/linux/sunrpc/svcauth.h2007-12-10 
16:01:43.0 +0100
+++ linux-2.6.24-rc4-IPv6-cache-based/include/linux/sunrpc/svcauth.h
2007-12-10 17:09:34.0 +0100

@@ -120,10 +120,10 @@ extern voidsvc_auth_unregister(rpc_auth

extern struct auth_domain *unix_domain_find(char *name);
extern void auth_domain_put(struct auth_domain *item);
-extern int auth_unix_add_addr(struct in_addr addr, struct auth_domain 
*dom);
+extern int auth_unix_add_addr(struct in6_addr *addr, struct auth_domain 
*dom);
extern struct auth_domain *auth_domain_lookup(char *name, struct 
auth_domain *new);

extern struct auth_domain *auth_domain_find(char *name);
-extern struct auth_domain *auth_unix_lookup(struct in_addr addr);
+extern struct auth_domain *auth_unix_lookup(struct in6_addr *addr);
extern int auth_unix_forget_old(struct auth_domain *dom);
extern void svcauth_unix_purge(void);
extern void svcauth_unix_info_release(void *);
diff -p -u -r -N linux-2.6.24-rc4/net/sunrpc/svcauth_unix.c 
linux-2.6.24-rc4-IPv6-cache-based/net/sunrpc/svcauth_unix.c
--- linux-2.6.24-rc4/net/sunrpc/svcauth_unix.c2007-12-10 
16:01:46.0 +0100
+++ linux-2.6.24-rc4-IPv6-cache-based/net/sunrpc/svcauth_unix.c
2007-12-10 17:38:50.0 +0100

@@ -11,7 +11,8 @@
#include linux/hash.h
#include linux/string.h
#include net/sock.h
-
+#include net/ipv6.h
+#include linux/kernel.h
#define RPCDBG_FACILITYRPCDBG_AUTH


@@ -84,7 +85,7 @@ static void svcauth_unix_domain_release(
struct ip_map {
struct cache_headh;
charm_class[8]; /* e.g. nfsd */
-struct in_addr
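
The "mapping address" steps in this patch store each IPv4 client address as
an IPv4-mapped IPv6 address (::ffff:a.b.c.d, RFC 4291) so that one cache
keyed on struct in6_addr can serve both families. A userspace sketch of that
mapping follows. map_v4_to_v6() is a hypothetical helper that writes the
bytes directly rather than calling the kernel's ipv6_addr_set(), and the
constant passed to htonl() in the archived patch is garbled and is not
reconstructed here.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Builds the IPv4-mapped IPv6 address ::ffff:a.b.c.d from an IPv4
 * address held in network byte order.
 */
static void map_v4_to_v6(const struct in_addr *v4, struct in6_addr *v6)
{
	assert(v4 != NULL && v6 != NULL);

	memset(v6, 0, sizeof(*v6));		/* bytes 0..9 are zero */
	v6->s6_addr[10] = 0xff;			/* bytes 10..11 are ff:ff */
	v6->s6_addr[11] = 0xff;
	memcpy(&v6->s6_addr[12], &v4->s_addr,	/* bytes 12..15: IPv4 */
	       sizeof(v4->s_addr));
}
```

The result can be checked with the standard IN6_IS_ADDR_V4MAPPED() macro
from <netinet/in.h>.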

Re: [PATCH] IPv6 support for NFS server

2007-12-10 Thread J. Bruce Fields

On Mon, Dec 10, 2007 at 07:34:41PM +0100, Aurélien Charbon wrote:

 Here is a cleanup for the ip_map caching patch in nfs server.

 It prepares for IPv6 text-based mounts and exports.

 Tests: tested with only IPv4 network and basic nfs ops (mount, file 
 creation and modification)

Thanks!  And also tested with an unmodified rpc.mountd?

--b.


 -

 Signed-off-by: Aurelien Charbon [EMAIL PROTECTED]

 diff -p -u -r -N linux-2.6.24-rc4/fs/nfsd/export.c 
 linux-2.6.24-rc4-IPv6-cache-based/fs/nfsd/export.c
 --- linux-2.6.24-rc4/fs/nfsd/export.c2007-12-10 16:11:37.0 +0100
 +++ linux-2.6.24-rc4-IPv6-cache-based/fs/nfsd/export.c2007-12-10 
 17:50:37.0 +0100
 @@ -35,6 +35,7 @@
 #include linux/lockd/bind.h
 #include linux/sunrpc/msg_prot.h
 #include linux/sunrpc/gss_api.h
 +#include net/ipv6.h

 #define NFSDDBG_FACILITYNFSDDBG_EXPORT

 @@ -1556,6 +1557,7 @@ exp_addclient(struct nfsctl_client *ncp)
 {
 struct auth_domain*dom;
 inti, err;
 +struct in6_addr addr6;

 /* First, consistency check. */
 err = -EINVAL;
 @@ -1574,9 +1576,12 @@ exp_addclient(struct nfsctl_client *ncp)
 goto out_unlock;

 /* Insert client into hashtable. */
 -for (i = 0; i  ncp-cl_naddr; i++)
 -auth_unix_add_addr(ncp-cl_addrlist[i], dom);
 -
 +for (i = 0; i  ncp-cl_naddr; i++) {
 +/* Mapping address */
 +ipv6_addr_set(addr6, 0, 0,
 +htonl(0x), ncp-cl_addrlist[i].s_addr);
 +auth_unix_add_addr(addr6, dom);
 +}
 auth_unix_forget_old(dom);
 auth_domain_put(dom);

 diff -p -u -r -N linux-2.6.24-rc4/fs/nfsd/nfsctl.c 
 linux-2.6.24-rc4-IPv6-cache-based/fs/nfsd/nfsctl.c
 --- linux-2.6.24-rc4/fs/nfsd/nfsctl.c2007-12-10 16:11:37.0 +0100
 +++ linux-2.6.24-rc4-IPv6-cache-based/fs/nfsd/nfsctl.c2007-12-10 
 18:15:22.0 +0100
 @@ -37,6 +37,7 @@
 #include linux/nfsd/syscall.h

 #include asm/uaccess.h
 +#include net/ipv6.h

 /*
  *We have a single directory with 9 nodes in it.
 @@ -222,6 +223,7 @@ static ssize_t write_getfs(struct file *
 struct auth_domain *clp;
 int err = 0;
 struct knfsd_fh *res;
 +struct in6_addr in6;

 if (size  sizeof(*data))
 return -EINVAL;
 @@ -236,7 +238,13 @@ static ssize_t write_getfs(struct file *
 res = (struct knfsd_fh*)buf;

 exp_readlock();
 -if (!(clp = auth_unix_lookup(sin-sin_addr)))
 +
 +/* IPv6 address mapping */
 +ipv6_addr_set(in6, 0, 0,
 +htonl(0x),
 +(((struct sockaddr_in *)data-gd_addr)-sin_addr.s_addr));
 +
 +if (!(clp = auth_unix_lookup(in6)))
 err = -EPERM;
 else {
 err = exp_rootfh(clp, data-gd_path, res, data-gd_maxlen);
 @@ -257,6 +265,7 @@ static ssize_t write_getfd(struct file *
 int err = 0;
 struct knfsd_fh fh;
 char *res;
 +struct in6_addr in6;

 if (size  sizeof(*data))
 return -EINVAL;
 @@ -271,7 +280,12 @@ static ssize_t write_getfd(struct file *
 res = buf;
 sin = (struct sockaddr_in *)data-gd_addr;
 exp_readlock();
 -if (!(clp = auth_unix_lookup(sin-sin_addr)))
 +/* IPv6 address mapping */
 +ipv6_addr_set(in6, 0, 0,
 +htonl(0x),
 +(((struct sockaddr_in *)data-gd_addr)-sin_addr.s_addr));
 +
 +if (!(clp = auth_unix_lookup(in6)))
 err = -EPERM;
 else {
 err = exp_rootfh(clp, data-gd_path, fh, NFS_FHSIZE);
 diff -p -u -r -N linux-2.6.24-rc4/include/linux/sunrpc/svcauth.h 
 linux-2.6.24-rc4-IPv6-cache-based/include/linux/sunrpc/svcauth.h
 --- linux-2.6.24-rc4/include/linux/sunrpc/svcauth.h2007-12-10 
 16:01:43.0 +0100
 +++ linux-2.6.24-rc4-IPv6-cache-based/include/linux/sunrpc/svcauth.h
 2007-12-10 17:09:34.0 +0100
 @@ -120,10 +120,10 @@ extern voidsvc_auth_unregister(rpc_auth

 extern struct auth_domain *unix_domain_find(char *name);
 extern void auth_domain_put(struct auth_domain *item);
 -extern int auth_unix_add_addr(struct in_addr addr, struct auth_domain 
 *dom);
 +extern int auth_unix_add_addr(struct in6_addr *addr, struct auth_domain 
 *dom);
 extern struct auth_domain *auth_domain_lookup(char *name, struct 
 auth_domain *new);
 extern struct auth_domain *auth_domain_find(char *name);
 -extern struct auth_domain *auth_unix_lookup(struct in_addr addr);
 +extern struct auth_domain *auth_unix_lookup(struct in6_addr *addr);
 extern int auth_unix_forget_old(struct auth_domain *dom);
 extern void svcauth_unix_purge(void);
 extern void svcauth_unix_info_release(void *);
 diff -p -u -r -N linux-2.6.24-rc4/net/sunrpc/svcauth_unix.c 
 linux-2.6.24-rc4-IPv6-cache-based/net/sunrpc/svcauth_unix.c
 --- linux-2.6.24-rc4/net/sunrpc/svcauth_unix.c2007-12-10 
 16:01:46.0 +0100
 +++ linux-2.6.24-rc4-IPv6-cache-based/net/sunrpc/svcauth_unix.c
 2007-12-10 17:38:50.0 +0100
 @@ -11,7 +11,8 @@
 #include <linux/hash.h>
 #include <linux/string.h>
 #include <net/sock.h>
 -
 +#include
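
For readers following the "IPv6 address mapping" hunk in nfsctl.c above: the patch hands an IPv4 client address to auth_unix_lookup() as an IPv4-mapped IPv6 address. Below is a minimal user-space sketch of that mapping, assuming the usual ::ffff:a.b.c.d layout. map_ipv4_to_ipv6() is a hypothetical helper for illustration only; it uses the POSIX socket headers, not the kernel's ipv6_addr_set().

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical helper (not from the patch): build the IPv4-mapped IPv6
 * address ::ffff:a.b.c.d from an IPv4 address in network byte order. */
static struct in6_addr map_ipv4_to_ipv6(struct in_addr v4)
{
	struct in6_addr v6;

	memset(&v6, 0, sizeof(v6));	/* first 80 bits are zero */
	v6.s6_addr[10] = 0xff;		/* next 16 bits are all ones */
	v6.s6_addr[11] = 0xff;
	/* last 32 bits carry the IPv4 address, already in network order */
	memcpy(&v6.s6_addr[12], &v4.s_addr, sizeof(v4.s_addr));
	return v6;
}
```

A caller would convert the address once and then pass a pointer to the result, which matches the new pointer-based prototypes in svcauth.h.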

Re: [PATCH RFC] [3/9] modpost: Declare the modpost error functions as printf like

On Thu, Nov 22, 2007 at 03:43:08AM +0100, Andi Kleen wrote:
 
 This way gcc can warn for wrong format strings

This looks good. Can I get a s-o-b then I will apply it.

Sam


 
 ---
  scripts/mod/modpost.c |8 +---
  1 file changed, 5 insertions(+), 3 deletions(-)
 
 Index: linux/scripts/mod/modpost.c
 ===
 --- linux.orig/scripts/mod/modpost.c
 +++ linux/scripts/mod/modpost.c
 @@ -33,7 +33,9 @@ enum export {
   export_unused_gpl, export_gpl_future, export_unknown
  };
  
 -void fatal(const char *fmt, ...)
 +#define PRINTF __attribute__ ((format (printf, 1, 2)))
 +
 +PRINTF void fatal(const char *fmt, ...)
  {
   va_list arglist;
  
 @@ -46,7 +48,7 @@ void fatal(const char *fmt, ...)
   exit(1);
  }
  
 -void warn(const char *fmt, ...)
 +PRINTF void warn(const char *fmt, ...)
  {
   va_list arglist;
  
 @@ -57,7 +59,7 @@ void warn(const char *fmt, ...)
   va_end(arglist);
  }
  
 -void merror(const char *fmt, ...)
 +PRINTF void merror(const char *fmt, ...)
  {
   va_list arglist;
  

Re: [PATCH RFC] [4/9] modpost: Fix format string warnings

On Thu, Nov 22, 2007 at 03:43:09AM +0100, Andi Kleen wrote:
 
 Fix wrong format strings in modpost exposed by the previous patch.
 Including one missing argument -- some random data was printed instead.

Looks good. Can I get a s-o-b then I will apply it.

Sam

 
 ---
  scripts/mod/modpost.c |7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)
 
 Index: linux/scripts/mod/modpost.c
 ===
 --- linux.orig/scripts/mod/modpost.c
 +++ linux/scripts/mod/modpost.c
 @@ -388,7 +388,7 @@ static int parse_elf(struct elf_info *in
  
   /* Check if file offset is correct */
   if (hdr->e_shoff > info->size) {
 - fatal("section header offset=%u in file '%s' is bigger then 
 filesize=%lu\n", hdr->e_shoff, filename, info->size);
 + fatal("section header offset=%lu in file '%s' is bigger then 
 filesize=%lu\n", (unsigned long)hdr->e_shoff, filename, info->size);
   return 0;
   }
  
 @@ -409,7 +409,7 @@ static int parse_elf(struct elf_info *in
   const char *secname;
  
   if (sechdrs[i].sh_offset > info->size) {
 - fatal("%s is truncated. sechdrs[i].sh_offset=%u > 
 sizeof(*hrd)=%ul\n", filename, (unsigned int)sechdrs[i].sh_offset, 
 sizeof(*hdr));
 + fatal("%s is truncated. sechdrs[i].sh_offset=%lu > 
 sizeof(*hrd)=%lu\n", filename, (unsigned long)sechdrs[i].sh_offset, 
 sizeof(*hdr));
   return 0;
   }
   secname = secstrings + sechdrs[i].sh_name;
 @@ -907,7 +907,8 @@ static void warn_sec_mismatch(const char
before '%s' (at offset -0x%llx)\n",
modname, fromsec, (unsigned long long)r.r_offset,
secname, refsymname,
 -  elf->strtab + after->st_name);
 +  elf->strtab + after->st_name,
 +  (unsigned long long)r.r_offset);
   } else {
   warn("%s(%s+0x%llx): Section mismatch: reference to %s:%s\n",
modname, fromsec, (unsigned long long)r.r_offset,

Re: [PATCH RFC] [3/9] modpost: Declare the modpost error functions as printf like

2007-12-10 Thread Andi Kleen

On Mon, Dec 10, 2007 at 07:50:08PM +0100, Sam Ravnborg wrote:
 On Thu, Nov 22, 2007 at 03:43:08AM +0100, Andi Kleen wrote:
  
  This way gcc can warn for wrong format strings
 
 This looks good. Can I get a s-o-b then I will apply it.

Sorry must have been left out by mistake.

Signed-off-by: Andi Kleen [EMAIL PROTECTED]

for both patches.

-Andi
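
For reference, the effect of the format attribute discussed in this thread can be reproduced outside modpost. The following is a small sketch, not the modpost code itself: demo_warn() is a hypothetical stand-in for warn(), and the macro is copied from the patch.

```c
#include <stdarg.h>
#include <stdio.h>

/* Same macro as in the patch: argument 1 is the format string and the
 * variadic arguments start at position 2. */
#define PRINTF __attribute__ ((format (printf, 1, 2)))

/* Hypothetical stand-in for modpost's warn(). It returns the number of
 * characters written for the message so that callers can check it. */
PRINTF int demo_warn(const char *fmt, ...)
{
	va_list arglist;
	int n;

	fprintf(stderr, "WARNING: ");

	va_start(arglist, fmt);
	n = vfprintf(stderr, fmt, arglist);
	va_end(arglist);
	return n;
}
```

With gcc -Wall, a call such as demo_warn("%s\n", 42) should now draw a format warning, while demo_warn("offset=%lu\n", (unsigned long)off) compiles cleanly. That is the class of mistake the follow-up patch [4/9] fixes.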


Re: [1/4] DST: Distributed storage documentation.

On Mon, 2007-12-10 at 17:50 +0300, Evgeniy Polyakov wrote:
 On Mon, Dec 10, 2007 at 03:31:48PM +0100, Kay Sievers ([EMAIL PROTECTED]) 
 wrote:
   I meant that for each new device, it will be placed into
   /sys/devices/its_name, but it can also be accessed via
   /sys/bus/dst/devices/
  
  Still, it looks like a path. :)
  
  Please don't reference any device directly with a /sys/devices/ path.
  You have to use the subsystem links to the devices
  in /sys/bus/dst/devices/. Devices are free to move around
  in /sys/devices, even during runtime. Yours don't do, but anyway, please
  remove all mentioning of direct access to /sys/devices/.
 
 Ok, I will update documentation to reference /sys/bus/dst/devices
 instead of /sys/devices

Great, thanks!

  Btw, where is the top-level /sys/devices/storage/ coming from? I don't
  see that in the code. We don't accept any new virtual parents here.
 
  Your devices will automatically appear in /sys/devices/virtual/dst/, and
  not below your own parent. But that path does not matter anyway, because
  you should only access them from the /sys/bus/dst/devices/ directory.
  
  And in general please don't claim generic names like storage in any
  namespace for a very specific subsystem like this.
 
 It is not a parent - it is an example for device called 'storage', if it
 will be called 'testing', then path will be /sys/devices/testing or more
 correct /sys/bus/dst/devices/testing :)

Ah, I see.

   It is 'dst' bus.
   
   uganda:~/codes# ls -la /sys/devices/storage/
   total 0
   drwxr-xr-x 4 root root    0 2007-12-10 11:46 .
   drwxr-xr-x 9 root root    0 2007-12-10 11:46 ..
   -r--r--r-- 1 root root 4096 2007-12-10 11:46 alg
   lrwxrwxrwx 1 root root    0 2007-12-10 11:46 bus -> ../../bus/dst
   drwxr-xr-x 3 root root    0 2007-12-10 11:46 n-0-81003e24117
   -r--r--r-- 1 root root 4096 2007-12-10 11:46 name
   -r--r--r-- 1 root root 4096 2007-12-10 11:46 nodes
   drwxr-xr-x 2 root root    0 2007-12-10 11:46 power
   -rw-r--r-- 1 root root 4096 2007-12-10 11:46 remove_all_nodes
   lrwxrwxrwx 1 root root    0 2007-12-10 11:46 subsystem -> ../../bus/dst
   -rw-r--r-- 1 root root 4096 2007-12-10 11:46 uevent
  
  Ok, how does:
ls -l /sys/devices/storage/n-0-81003e24117
  look?
 
 uganda:~/codes# ls -l /sys/devices/storage/n-0-81003ebc220/
 total 0
 drwxr-xr-x 2 root root    0 2007-12-10 13:23 power
 -r--r--r-- 1 root root 4096 2007-12-10 13:30 size
 -r--r--r-- 1 root root 4096 2007-12-10 13:30 start
 -r--r--r-- 1 root root 4096 2007-12-10 13:30 type
 -rw-r--r-- 1 root root 4096 2007-12-10 13:30 uevent

This is a struct device instance without a subsystem (bus/class),
right? It will not send an uevent to userspace. Is that intended? Why
don't you add them all to the dst bus? 

   uganda:~/codes# ls -l /sys/bus/dst/
   total 0
   drwxr-xr-x 2 root root    0 2007-12-10 09:52 devices
   drwxr-xr-x 2 root root    0 2007-12-10 09:52 drivers
   -rw-r--r-- 1 root root 4096 2007-12-10 11:46 drivers_autoprobe
   --w------- 1 root root 4096 2007-12-10 11:46 drivers_probe
  
  How does:
ls -l /sys/bus/dst/devices
  look?
 
 uganda:~/codes# ls -la /sys/bus/dst/devices/
 total 0
 drwxr-xr-x 2 root root 0 2007-12-10 13:30 .
 drwxr-xr-x 4 root root 0 2007-12-10 13:22 ..
 lrwxrwxrwx 1 root root 0 2007-12-10 13:30 storage -> ../../../devices/storage
 
 Here 'storage' is just a name for device called 'storage', it can be
 anything else.

Fine.

  Further questions:
  Why do you do your own refcounting instead of using kref?
 
 That's because I have always used atomic operations as reference counters
 and have not tried krefs :)
 They are actually the same (modulo tricky arches where smp_mb_* are
 required), so I can replace them in the next release.

On Mon, 2007-12-10 at 18:12 +0300, Evgeniy Polyakov wrote:
 Actually not - I have to set reference counter to something other than 1
 or +/- 1, and thus will have to call kref_get() in a loop, which is a
 very ugly step. Is there kref_set() or something like that? At least not
 in 2.6.22, which I'm using for now.

Yeah, a loop would look pretty ugly. How about just adding kref_set(),
if you need it.

  Why don't you use groups for the attributes?
 
 For 3-4 attributes it is faster to register them in a loop than typing
 another structure :)

Yeah, but if you would need to recover from an error when the creation
of a file fails, a group would do the proper rollback.

  Why don't you use default attributes for the device, where you get all
  error handling done by the core.
 
 What is 'default attributes' and for what devices?
 All my sysfs files are so much trivial, so they do not need anything
 special and I do not see what is error handling you mentioned.

If all devices of a subsystem (bus/class) are of the same type, you can
set a default array of attributes in the struct bus/class to be
created at every device. If you have multiple types of devices in the
same subsystem (bus/class) you can assign a device_type, which
has the

Re: [PATCH RFC] [5/9] modpost: Fix a buffer overflow in modpost

On Thu, Nov 22, 2007 at 03:43:10AM +0100, Andi Kleen wrote:
 
 When passing a file name > 1k the stack could be overflowed.
 Not really a security issue, but still better plugged.

Looks good. A s-o-b line again please.
Although I am not so happy with the use of gcc extensions.

Sam
 
 
 ---
  scripts/mod/modpost.c |3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)
 
 Index: linux/scripts/mod/modpost.c
 ===
 --- linux.orig/scripts/mod/modpost.c
 +++ linux/scripts/mod/modpost.c
 @@ -1656,7 +1656,6 @@ int main(int argc, char **argv)
  {
   struct module *mod;
   struct buffer buf = { };
 - char fname[SZ];
   char *kernel_read = NULL, *module_read = NULL;
   char *dump_write = NULL;
   int opt;
 @@ -1709,6 +1708,8 @@ int main(int argc, char **argv)
   err = 0;
  
   for (mod = modules; mod; mod = mod->next) {
 + char fname[strlen(mod->name) + 10];
 +
   if (mod->skip)
   continue;
  

[PATCH] drivers/net/ipg: Remove local definition of TRUE/FALSE

2007-12-10 Thread Richard Knutsson

Remove local definition of TRUE/FALSE.

Signed-off-by: Richard Knutsson [EMAIL PROTECTED]
---


diff --git a/drivers/net/ipg.h b/drivers/net/ipg.h
index d5d092c..4484778 100644
--- a/drivers/net/ipg.h
+++ b/drivers/net/ipg.h
@@ -490,38 +490,34 @@ enum ipg_regs {
  * Tune
  */
 
-/* Miscellaneous Constants. */
-#define   TRUE  1
-#define   FALSE 0
-
 /* Assign IPG_APPEND_FCS_ON_TX > 0 for auto FCS append on TX. */
-#define IPG_APPEND_FCS_ON_TX TRUE
+#define IPG_APPEND_FCS_ON_TX true
 
 /* Assign IPG_APPEND_FCS_ON_TX > 0 for auto FCS strip on RX. */
-#define IPG_STRIP_FCS_ON_RX  TRUE
+#define IPG_STRIP_FCS_ON_RX  true
 
 /* Assign IPG_DROP_ON_RX_ETH_ERRORS > 0 to drop RX frames with
  * Ethernet errors.
  */
-#define IPG_DROP_ON_RX_ETH_ERRORS TRUE
+#define IPG_DROP_ON_RX_ETH_ERRORS true
 
 /* Assign IPG_INSERT_MANUAL_VLAN_TAG > 0 to insert VLAN tags manually
  * (via TFC).
  */
-#define IPG_INSERT_MANUAL_VLAN_TAG FALSE
+#define IPG_INSERT_MANUAL_VLAN_TAG false
 
 /* Assign IPG_ADD_IPCHECKSUM_ON_TX > 0 for auto IP checksum on TX. */
-#define IPG_ADD_IPCHECKSUM_ON_TX FALSE
+#define IPG_ADD_IPCHECKSUM_ON_TX false
 
 /* Assign IPG_ADD_TCPCHECKSUM_ON_TX > 0 for auto TCP checksum on TX.
  * DO NOT USE FOR SILICON REVISIONS B3 AND EARLIER.
  */
-#define IPG_ADD_TCPCHECKSUM_ON_TX FALSE
+#define IPG_ADD_TCPCHECKSUM_ON_TX false
 
 /* Assign IPG_ADD_UDPCHECKSUM_ON_TX > 0 for auto UDP checksum on TX.
  * DO NOT USE FOR SILICON REVISIONS B3 AND EARLIER.
  */
-#define IPG_ADD_UDPCHECKSUM_ON_TX FALSE
+#define IPG_ADD_UDPCHECKSUM_ON_TX false
 
 /* If inserting VLAN tags manually, assign the IPG_MANUAL_VLAN_xx
  * constants as desired.

Re: [1/4] DST: Distributed storage documentation.

On Mon, Dec 10, 2007 at 08:02:28PM +0100, Kay Sievers ([EMAIL PROTECTED]) wrote:
  uganda:~/codes# ls -l /sys/devices/storage/n-0-81003ebc220/
  total 0
  drwxr-xr-x 2 root root    0 2007-12-10 13:23 power
  -r--r--r-- 1 root root 4096 2007-12-10 13:30 size
  -r--r--r-- 1 root root 4096 2007-12-10 13:30 start
  -r--r--r-- 1 root root 4096 2007-12-10 13:30 type
  -rw-r--r-- 1 root root 4096 2007-12-10 13:30 uevent
 
 This is a struct device instance without a subsystem (bus/class),
 right? It will not send an uevent to userspace. Is that intended? Why
 don't you add them all to the dst bus? 

I created dst bus for storage devices only, nodes are very different
objects, and actually they do not need any events from above, but I need
to put some attributes somewhere, so it is 'empty' device.

  Actually not - I have to set reference counter to something other than 1
  or +/- 1, and thus will have to call kref_get() in a loop, which is a
  very ugly step. Is there kref_set() or something like that? At least not
  in 2.6.22, which I'm using for now.
 
 Yeah, a loop would look pretty ugly. How about just adding kref_set(),
 if you need it.

Well, then distributed storage will not be able to build as a
standalone module, and kref_set() itself will not be accepted as a single 
patch, since there are no in-kernel users :)
It is easily doable though.

   Why don't you use groups for the attributes?
  
  For 3-4 attributes it is faster to register them in a loop than typing
  another structure :)
 
 Yeah, but if you would need to recover from an error when the creation
 of a file fails, a group would do the proper rollback.

I do not care about such errors - if there is such an error for a file,
which exports information about type of the node (i.e. string L or R)
or some other very meaningful info, then system has enough to care about
instead of this, so dst does not do anything special - it ignores such
errors :)

On exit path it will be checked and removed correctly.
If there will be additional sysfs files, I think group is a good way to
implement them.

   Why don't you use default attributes for the device, where you get all
   error handling done by the core.
  
  What is 'default attributes' and for what devices?
  All my sysfs files are so much trivial, so they do not need anything
  special and I do not see what is error handling you mentioned.
 
 If all devices of a subsystem (bus/class) are of the same type, you can
 set a default array of attributes in the struct bus/class to be
 created at every device. If you have multiple types of devices in the
 same subsystem (bus/class) you can assign a device_type, which
 has the default attribute group.
 That way the core will create the files before the event is sent out to
 userspace, and the files can be accessed from the event itself. Not sure
 if that is needed for dst.

Ok, I see.

DST right now has 3 types of files - storage files, it is common for
every storage device; node files, which are the same for every node; and
per-algorithm private devices - they can be different (actually only
mirroring algorithm exports something to userspace).

I think it is possible to use default attributes for storage devices,
but node device does not have a bus/class, so they will be untouched.

 Thanks,
 Kay

-- 
Evgeniy Polyakov

[PATCH] iproute2: off by one in nested attribute parse

2007-12-10 Thread Stephen Hemminger

Fix off by one in nested attribute management.

Fixes segv in:
  tc qdisc show dev eth1
due to uninitialized attribute table.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
---
 lib/libnetlink.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index 12883fe..d13596f 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -632,6 +632,6 @@ int __parse_rtattr_nested_compat(struct rtattr *tb[], int max, struct rtattr *rt
rta = RTA_DATA(rta) + RTA_ALIGN(len);
return parse_rtattr_nested(tb, max, rta);
}
-   memset(tb, 0, sizeof(struct rtattr *) * max);
+   memset(tb, 0, sizeof(struct rtattr *) * (max + 1));
return 0;
 }
-- 
1.5.3.4
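
To illustrate the off-by-one fixed above: an attribute table that is indexed by attribute type 0..max holds max + 1 pointers, so clearing only max entries leaves the last slot uninitialized. A minimal sketch of the corrected clearing, assuming callers declare the table with max + 1 entries; struct attr and clear_attr_table() are hypothetical stand-ins, not libnetlink names:

```c
#include <string.h>

struct attr;	/* opaque stand-in for struct rtattr */

/* Clear an attribute table indexed by attribute type 0..max,
 * i.e. one that holds max + 1 pointers. */
static void clear_attr_table(struct attr *tb[], int max)
{
	memset(tb, 0, sizeof(struct attr *) * (max + 1));
}
```

A caller that later reads tb[max] then sees NULL rather than stale stack contents.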


Re: [1/4] DST: Distributed storage documentation.

On Mon, 2007-12-10 at 22:33 +0300, Evgeniy Polyakov wrote:
 On Mon, Dec 10, 2007 at 08:02:28PM +0100, Kay Sievers ([EMAIL PROTECTED]) 
 wrote:
   uganda:~/codes# ls -l /sys/devices/storage/n-0-81003ebc220/
   total 0
   drwxr-xr-x 2 root root    0 2007-12-10 13:23 power
   -r--r--r-- 1 root root 4096 2007-12-10 13:30 size
   -r--r--r-- 1 root root 4096 2007-12-10 13:30 start
   -r--r--r-- 1 root root 4096 2007-12-10 13:30 type
   -rw-r--r-- 1 root root 4096 2007-12-10 13:30 uevent
  
  This is a struct device instance without a subsystem (bus/class),
  right? It will not send an uevent to userspace. Is that intended? Why
  don't you add them all to the dst bus? 
 
 I created dst bus for storage devices only, nodes are very different
 objects, and actually they do not need any events from above, but I need
 to put some attributes somewhere, so it is 'empty' device.

Ok.

   Actually not - I have to set reference counter to something other than 1
   or +/- 1, and thus will have to call kref_get() in a loop, which is a
   very ugly step. Is there kref_set() or something like that? At least not
   in 2.6.22, which I'm using for now.
  
  Yeah, a loop would look pretty ugly. How about just adding kref_set(),
  if you need it.
 
 Well, then it distributed storage will not be able to build as
 standalone module, and kref_set() itself will not be accepted as a single 
 patch, since there are no in-kernel users :)
 It is easily doable though.

Most rules have exceptions. :) Send a patch, so we can see how it looks
like.

Why don't you use groups for the attributes?
   
   For 3-4 attributes it is faster to register them in a loop than typing
   another structure :)
  
  Yeah, but if you would need to recover from an error when the creation
  of a file fails, a group would do the proper rollback.
 
 I do not care about such errors - if there is such an error for a file,
 which exports information about type of the node (i.e. string L or R)
 or some other very meaningful info, then system has enough to care about
 instead of this, so dst does not do anything special - it ignores such
 errors :)
 
 On exit path it will be checked and removed correctly.
 If there will be additional sysfs files, I think group is a good way to
 implement them.
 
Why don't you use default attributes for the device, where you get all
error handling done by the core.
   
   What is 'default attributes' and for what devices?
   All my sysfs files are so much trivial, so they do not need anything
   special and I do not see what is error handling you mentioned.
  
  If all devices of a subsystem (bus/class) are of the same type, you can
  set a default array of attributes in the struct bus/class to be
  created at every device. If you have multiple types of devices in the
  same subsystem (bus/class) you can assign a device_type, which
  has the default attribute group.
  That way the core will create the files before the event is sent out to
  userspace, and the files can be accessed from the event itself. Not sure
  if that is needed for dst.
 
 Ok, I see.
 
 DST right now has 3 types of files - storage files, it is common for
 every storage device; node files, which are the same for every node; and
 per-algorithm private devices - they can be different (actually only
 mirroring algorithm exports something to userspace).
 
 I think it is possible to use default attributes for storage devices,
 but node device does not have a bus/class, so they will be untouched.

Sounds fine.

Thanks,
Kay
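
The rollback argument for attribute groups made in this thread can be sketched in user space. This is only an illustration of the "all or nothing" pattern, not kernel code: struct demo_attr, demo_create() and demo_remove() are hypothetical stand-ins for sysfs attribute files and their create/remove calls.

```c
#include <stddef.h>

/* Hypothetical stand-in for one sysfs attribute file. */
struct demo_attr {
	const char *name;
	int fail;	/* simulate a creation failure */
	int created;	/* non-zero while the "file" exists */
};

static int demo_create(struct demo_attr *a)
{
	if (a->fail)
		return -1;
	a->created = 1;
	return 0;
}

static void demo_remove(struct demo_attr *a)
{
	a->created = 0;
}

/* Create all attributes or none: on the first failure, remove the
 * files that were already created and report the error. */
static int demo_create_group(struct demo_attr *attrs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int err = demo_create(&attrs[i]);

		if (err) {
			while (i-- > 0)
				demo_remove(&attrs[i]);
			return err;
		}
	}
	return 0;
}
```

Registering files one by one in a loop without the inner while loop would leave the earlier files behind after a failure, which is the case a group handles for the caller.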


Re: [1/4] DST: Distributed storage documentation.

On Mon, Dec 10, 2007 at 08:44:55PM +0100, Kay Sievers ([EMAIL PROTECTED]) wrote:
Actually not - I have to set reference counter to something other than 1
or +/- 1, and thus will have to call kref_get() in a loop, which is a
very ugly step. Is there kref_set() or something like that? At least not
in 2.6.22, which I'm using for now.
   
   Yeah, a loop would look pretty ugly. How about just adding kref_set(),
   if you need it.
  
  Well, then it distributed storage will not be able to build as
  standalone module, and kref_set() itself will not be accepted as a single 
  patch, since there are no in-kernel users :)
  It is easily doable though.
 
 Most rules have exceptions. :) Send a patch, so we can see how it looks
 like.

It looks really non-trivial :)

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/include/linux/kref.h b/include/linux/kref.h
index 6fee353..5d18563 100644
--- a/include/linux/kref.h
+++ b/include/linux/kref.h
@@ -24,6 +24,7 @@ struct kref {
atomic_t refcount;
 };
 
+void kref_set(struct kref *kref, int num);
 void kref_init(struct kref *kref);
 void kref_get(struct kref *kref);
 int kref_put(struct kref *kref, void (*release) (struct kref *kref));
diff --git a/lib/kref.c b/lib/kref.c
index a6dc3ec..40aa9f9 100644
--- a/lib/kref.c
+++ b/lib/kref.c
@@ -15,13 +15,23 @@
 #include <linux/module.h>
 
 /**
+ * kref_set - initialize object and set refcount to requested number.
+ * @kref: object in question.
+ * @num: initial reference counter
+ */
+void kref_set(struct kref *kref, int num)
+{
+   atomic_set(&kref->refcount, num);
+   smp_mb();
+}
+
+/**
  * kref_init - initialize object.
  * @kref: object in question.
  */
 void kref_init(struct kref *kref)
 {
-   atomic_set(&kref->refcount,1);
-   smp_mb();
+   kref_set(kref, 1);
 }
 
 /**

-- 
Evgeniy Polyakov

Re: [PATCH] drivers/net/ipg: Remove local definition of TRUE/FALSE

2007-12-10 Thread Pekka Enberg

Hi Richard,

On Dec 10, 2007 9:29 PM, Richard Knutsson [EMAIL PROTECTED] wrote:
 Remove local definition of TRUE/FALSE.

This is already fixed in Francois' tree:

http://git.kernel.org/?p=linux/kernel/git/romieu/netdev-2.6.git;a=commitdiff;h=2af61e99e3d1c959840ea007ff56b15db794fb99

   Pekka

Re: [1/4] DST: Distributed storage documentation.

On Mon, 2007-12-10 at 22:51 +0300, Evgeniy Polyakov wrote:
 On Mon, Dec 10, 2007 at 08:44:55PM +0100, Kay Sievers ([EMAIL PROTECTED]) 
 wrote:
 Actually not - I have to set reference counter to something other than 1
 or +/- 1, and thus will have to call kref_get() in a loop, which is a
 very ugly step. Is there kref_set() or something like that? At least not
 in 2.6.22, which I'm using for now.

Yeah, a loop would look pretty ugly. How about just adding kref_set(),
if you need it.
   
   Well, then it distributed storage will not be able to build as
   standalone module, and kref_set() itself will not be accepted as a single 
   patch, since there are no in-kernel users :)
   It is easily doable though.
  
  Most rules have exceptions. :) Send a patch, so we can see how it looks
  like.
 
 It looks really non-trivial :)

Yeah, it does. :)
We miss an EXPORT_SYMBOL(), right?

Kay


Re: [PATCH RFC] [5/9] modpost: Fix a buffer overflow in modpost

2007-12-10 Thread Andi Kleen

On Monday 10 December 2007 20:32, Sam Ravnborg wrote:
 On Thu, Nov 22, 2007 at 03:43:10AM +0100, Andi Kleen wrote:
  When passing a file name > 1k the stack could be overflowed.
  Not really a security issue, but still better plugged.

 Looks good. A s-o-b line again please.

Signed-off-by: Andi Kleen [EMAIL PROTECTED]

 Although I am not so happy with the use of gcc extensions.

That's not a gcc extension. It's C99.

-Andi
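
The construct in question is a C99 variable length array sized from the module name, so the buffer always fits the name plus a short suffix. A small stand-alone sketch follows. It assumes the file name is built as "<name>.mod.c"; mod_source_name_len() is a hypothetical helper, not a modpost function.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "<name>.mod.c" in a C99 variable length
 * array sized from the name itself and return the resulting length. */
static size_t mod_source_name_len(const char *mod_name)
{
	/* +10 leaves room for the ".mod.c" suffix and the trailing NUL */
	char fname[strlen(mod_name) + 10];

	snprintf(fname, sizeof(fname), "%s.mod.c", mod_name);
	return strlen(fname);
}
```

Unlike a fixed char fname[SZ], the array here grows with the input, so a name longer than 1k no longer overruns the buffer. The trade-off is that very long names consume a matching amount of stack.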

Re: [1/4] DST: Distributed storage documentation.

On Mon, Dec 10, 2007 at 08:56:49PM +0100, Kay Sievers ([EMAIL PROTECTED]) wrote:
 On Mon, 2007-12-10 at 22:51 +0300, Evgeniy Polyakov wrote:
  On Mon, Dec 10, 2007 at 08:44:55PM +0100, Kay Sievers ([EMAIL PROTECTED]) 
  wrote:
  Actually not - I have to set reference counter to something other than 1
  or +/- 1, and thus will have to call kref_get() in a loop, which is a
  very ugly step. Is there kref_set() or something like that? At least not
  in 2.6.22, which I'm using for now.
 
 Yeah, a loop would look pretty ugly. How about just adding kref_set(),
 if you need it.

Well, then it distributed storage will not be able to build as
standalone module, and kref_set() itself will not be accepted as a 
single 
patch, since there are no in-kernel users :)
It is easily doable though.
   
   Most rules have exceptions. :) Send a patch, so we can see how it looks
   like.
  
  It looks really non-trivial :)
 
 Yeah, it does. :)
 We miss an EXPORT_SYMBOL(), right?

Yep :)

diff --git a/include/linux/kref.h b/include/linux/kref.h
index 6fee353..5d18563 100644
--- a/include/linux/kref.h
+++ b/include/linux/kref.h
@@ -24,6 +24,7 @@ struct kref {
atomic_t refcount;
 };
 
+void kref_set(struct kref *kref, int num);
 void kref_init(struct kref *kref);
 void kref_get(struct kref *kref);
 int kref_put(struct kref *kref, void (*release) (struct kref *kref));
diff --git a/lib/kref.c b/lib/kref.c
index a6dc3ec..9ecd6e8 100644
--- a/lib/kref.c
+++ b/lib/kref.c
@@ -15,13 +15,23 @@
 #include <linux/module.h>
 
 /**
+ * kref_set - initialize object and set refcount to requested number.
+ * @kref: object in question.
+ * @num: initial reference counter
+ */
+void kref_set(struct kref *kref, int num)
+{
+   atomic_set(&kref->refcount, num);
+   smp_mb();
+}
+
+/**
  * kref_init - initialize object.
  * @kref: object in question.
  */
 void kref_init(struct kref *kref)
 {
-   atomic_set(&kref->refcount,1);
-   smp_mb();
+   kref_set(kref, 1);
 }
 
 /**
@@ -61,6 +71,7 @@ int kref_put(struct kref *kref, void (*release)(struct kref *kref))
return 0;
 }
 
+EXPORT_SYMBOL(kref_set);
 EXPORT_SYMBOL(kref_init);
 EXPORT_SYMBOL(kref_get);
 EXPORT_SYMBOL(kref_put);

-- 
Evgeniy Polyakov
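
For comparison with the "kref_get() in a loop" alternative mentioned earlier in the thread, here is a user-space sketch of the same idea using C11 atomics. It is an analogy only: struct demo_ref and the demo_ref_*() functions are hypothetical, and the kernel patch uses atomic_set() plus smp_mb() rather than <stdatomic.h>.

```c
#include <stdatomic.h>

/* User-space stand-in for struct kref. */
struct demo_ref {
	atomic_int refcount;
};

static int demo_released;	/* counts release callbacks, for illustration */

static void demo_release(struct demo_ref *ref)
{
	(void)ref;
	demo_released++;
}

/* Counterpart of the proposed kref_set(): start with num references. */
static void demo_ref_set(struct demo_ref *ref, int num)
{
	atomic_store(&ref->refcount, num);
}

/* Counterpart of kref_init(): a single initial reference. */
static void demo_ref_init(struct demo_ref *ref)
{
	demo_ref_set(ref, 1);
}

static void demo_ref_get(struct demo_ref *ref)
{
	atomic_fetch_add(&ref->refcount, 1);
}

/* Drop one reference; returns 1 if this call released the object. */
static int demo_ref_put(struct demo_ref *ref,
			void (*release)(struct demo_ref *))
{
	if (atomic_fetch_sub(&ref->refcount, 1) == 1) {
		release(ref);
		return 1;
	}
	return 0;
}
```

Calling demo_ref_set(&r, 3) leaves the counter in the same state as demo_ref_init(&r) followed by two demo_ref_get(&r) calls, but in one store instead of a loop.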

Re: 2.6.24-rc4-mm1

On Mon, 10 Dec 2007, Ilpo Järvinen wrote:

 Dave, please include this one to net-2.6.25.

...

 --
 [PATCH] [TCP]: Fix fack_count miscountings (multiple places)

I have a better version of this coming up, so Dave, please don't put this one
into net-2.6.25 (I noticed that both the original and the patched code
can get into an infinite loop, and the new code is still flawed in some rare
cases as well). I'll submit a better version soon.

-- 
 i.

Re: [PATCH RFC] [5/9] modpost: Fix a buffer overflow in modpost

On Mon, Dec 10, 2007 at 08:57:28PM +0100, Andi Kleen wrote:
 On Monday 10 December 2007 20:32, Sam Ravnborg wrote:
  On Thu, Nov 22, 2007 at 03:43:10AM +0100, Andi Kleen wrote:
   When passing a file name > 1k the stack could be overflowed.
   Not really a security issue, but still better plugged.
 
  Looks good. A s-o-b line again please.
 
 Signed-off-by: Andi Kleen [EMAIL PROTECTED]
 
  Although I am not so happy with the use of gcc extensions.
 
 That's not a gcc extension. It's C99.

OK.
I have applied all three patches to kbuild.git.

As I did not follow the whole thread about the namespace I did not
take those.
And the first patch touching module.c should go in via akpm I think.
It is outside my core-competence area at least.

Sam

Re: [PATCH] drivers/net/ipg: Remove local definition of TRUE/FALSE

2007-12-10 Thread Richard Knutsson


Pekka Enberg wrote:

Hi Richard,

On Dec 10, 2007 9:29 PM, Richard Knutsson [EMAIL PROTECTED] wrote:
  

Remove local definition of TRUE/FALSE.



This is already fixed in Francois' tree:

http://git.kernel.org/?p=linux/kernel/git/romieu/netdev-2.6.git;a=commitdiff;h=2af61e99e3d1c959840ea007ff56b15db794fb99
  

I see, thanks.

Richard Knutsson


Re: [PATCH 8/8] [PATCH v2] [CCID3]: Interface CCID3 code with newer Loss Intervals Database

2007-12-10 Thread Arnaldo Carvalho de Melo

On Sat, Dec 08, 2007 at 10:06:28AM +, Gerrit Renker wrote:
 This hooks up the TFRC Loss Interval database with CCID 3 packet reception.
 In addition, it makes the CCID-specific computation of the first loss
 interval (which requires access to all the guts of CCID3) local to ccid3.c.
 
 The patch also fixes an omission in the DCCP code, that of a default /
 fallback RTT value (defined in section 3.4 of RFC 4340 as 0.2 sec); while
 at it, the upper bound of 4 seconds for an RTT sample has been reduced to
 match the initial TCP RTO value of 3 seconds from [RFC 1122, 4.2.3.1].
 
 Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
 Signed-off-by: Ian McDonald [EMAIL PROTECTED]

When interfacing we must make sure that ccid3 tfrc_lh_slab is created
and then tfrc_li_cachep is not needed. I'm doing this while keeping
the structure of the patches, i.e. one introducing, the other removing.
But we need to create tfrc_lh_slab if we want the tree to be bisectable.

I'm doing this and keeping your Signed-off-by line, please holler if you
disagree for some reason.

- Arnaldo

Re: 2.6.24-rc4-mm1

2007-12-10 Thread Andrew Morton

On Tue, 11 Dec 2007 01:48:39 +1100
Reuben Farrelly [EMAIL PROTECTED] wrote:

On 5/12/2007 4:17 PM, Andrew Morton wrote:
Temporarily at

http://userweb.kernel.org/~akpm/2.6.24-rc4-mm1/

Will appear later at

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc4/2.6.24-rc4-mm1/

- Lots of device IDs have been removed from the e1000 driver and moved over
to e1000e. So if your e1000 stops working, you forgot to set
CONFIG_E1000E.

- The s390 build is still broken.

I'm seeing this most incredibly unhelpful (to debug) but fortunately
reproducible problem (so far 4/4 times) on this -mm kernel. I thought this
problem may have been related to another bug which I have reported (a TCP
oops), but even after applying a likely fix for that I am still seeing this
problem.

The machine boots up perfectly fine and runs good until I load it up.
In this case I can reliably cause this to occur by pulling a 3G ISO across the
GigE network from my Linux box to my PC. After maybe 50M or so, the console
just displays this (ignore initial boot banner):

* Starting local ... [ ok ]

This is tornado.reub.net (Linux x86_64 2.6.24-rc4-mm1) 00:24:01

tornado login: *** buffer overf

---

Yes - after displaying the 'f' in what I can only guess is the word
'overflow', the box spontaneously reboots. There is no further console
output until it starts to come back up again.

The problem does not exist in 2.6.23-gentoo kernels nor in a vanilla
2.6.24-rc4-git6 (phew!), so this looks to be an -mm only problem at this
stage.

I enabled a number of kernel debugging options but then I got no output at
all when the machine crashed.

I'm at a bit of a loss as to which subsystem this might be coming from, so
I'm not sure who to CC.

Box information is (still) up at
http://www.reub.net/files/kernel/2.6.24-rc4-mm1/

hm. grepping around for "buffer overflow" doesn't turn up anything except in
drivers which you won't be using on that machine.

I'd be suspecting networking, obviously. If you're feeling keen could you
please grab a 2.6.24-rc4 tree and apply 2.6.24-rc4-mm1's origin.patch and
git-net.patch and see if the bug is still present?

[PATCH net-2.6.25] qdisc: new rate limiter (v2)

2007-12-10 Thread Stephen Hemminger

This is a time based rate limiter for use in network testing. When doing
network tests it is often useful to test at reduced bandwidths. The existing
Token Bucket Filter provides rate control, but causes bursty traffic that
can cause different performance than real world. Another alternative is
the PSPacer, but it depends on pause frames which may also cause issues.

The qdisc depends on high resolution timers and clocks, so it will probably
use more CPU than others making it a poor choice for use when doing traffic
shaping for QOS. 
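
As a rough illustration of the deadline scheme described above, here is a user-space sketch (not the kernel code from this patch). It borrows NSEC_SCALE and the pkt_time() formula from sch_rlim.c. The names struct pacer, cost_from_rate() and pacer_try_send(), the way the cost is derived from the rate, and the choice to restart the deadline from the current time are all assumptions made for the example.

```c
#include <stdint.h>

#define NSEC_SCALE   6              /* 1/64 ns resolution, as in the patch */
#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical user-space stand-in for struct rlim_sched_data. */
struct pacer {
	uint64_t next_send_ns;  /* earliest time the next packet may leave */
	uint64_t cost;          /* nsec per byte, scaled by 64 */
	uint32_t overhead;      /* extra bytes charged per packet */
};

/* Assumption: cost = (8 bits/byte * 1e9 ns/s * 64) / rate in bits/s. */
static uint64_t cost_from_rate(uint64_t rate_bps)
{
	return ((8 * NSEC_PER_SEC) << NSEC_SCALE) / rate_bps;
}

/* Same arithmetic as pkt_time() in the patch: wire time of one packet. */
static uint64_t pkt_time_ns(const struct pacer *p, uint32_t len)
{
	return (p->cost * (len + p->overhead)) >> NSEC_SCALE;
}

/*
 * Returns 1 and moves the deadline forward if the packet may be sent
 * at time now_ns; returns 0 if the caller has to wait (the qdisc would
 * arm its watchdog timer for next_send_ns instead).
 */
static int pacer_try_send(struct pacer *p, uint64_t now_ns, uint32_t len)
{
	if (now_ns < p->next_send_ns)
		return 0;
	p->next_send_ns = now_ns + pkt_time_ns(p, len);
	return 1;
}
```

At 8 Mbit/s one byte takes 1000 ns, so a 1000-byte packet with 24 bytes of overhead holds the link for 1,024,000 ns before the next packet is released.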

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---
 include/linux/pkt_sched.h |   15 +
 net/sched/Kconfig |   13 +
 net/sched/Makefile|1 
 net/sched/sch_rlim.c  |  354 ++
 4 files changed, 383 insertions(+)

--- a/include/linux/pkt_sched.h 2007-12-10 13:08:36.0 -0800
+++ b/include/linux/pkt_sched.h 2007-12-10 13:08:39.0 -0800
@@ -475,4 +475,19 @@ struct tc_netem_corrupt
 
 #define NETEM_DIST_SCALE   8192
 
+enum
+{
+   TCA_RLIM_UNSPEC,
+   TCA_RLIM_PARMS,
+   __TCA_RLIM_MAX,
+};
+#define TCA_RLIM_MAX (__TCA_RLIM_MAX - 1)
+
+struct tc_rlim_qopt
+{
+   __u64   rate;   /* bits per sec */
+   __u32   overhead;   /* crc overhead */
+   __u32   limit;  /* fifo limit (packets) */
+};
+
 #endif
--- a/net/sched/Kconfig 2007-12-10 13:08:36.0 -0800
+++ b/net/sched/Kconfig 2007-12-10 13:08:39.0 -0800
@@ -196,6 +196,19 @@ config NET_SCH_NETEM
 
  If unsure, say N.
 
+config NET_SCH_RLIM
+   tristate "Network Rate Limiter"
+   ---help---
+ Say Y here if you want to use timer based network rate limiter
+ algorithm.
+
+ See the top of file:net/sched/sch_rlim.c for more details.
+
+ To compile this code as a module, choose M here: the
+ module will be called sch_rlim.
+
+ If unsure, say N.
+
 config NET_SCH_INGRESS
tristate "Ingress Qdisc"
---help---
--- a/net/sched/Makefile 2007-12-10 13:08:36.0 -0800
+++ b/net/sched/Makefile 2007-12-10 13:08:39.0 -0800
@@ -28,6 +28,7 @@ obj-$(CONFIG_NET_SCH_TEQL) += sch_teql.o
 obj-$(CONFIG_NET_SCH_PRIO) += sch_prio.o
 obj-$(CONFIG_NET_SCH_ATM)  += sch_atm.o
 obj-$(CONFIG_NET_SCH_NETEM)    += sch_netem.o
+obj-$(CONFIG_NET_SCH_RLIM) += sch_rlim.o
 obj-$(CONFIG_NET_CLS_U32)  += cls_u32.o
 obj-$(CONFIG_NET_CLS_ROUTE4)   += cls_route.o
 obj-$(CONFIG_NET_CLS_FW)   += cls_fw.o
--- /dev/null   1970-01-01 00:00:00.0 +
+++ b/net/sched/sch_rlim.c  2007-12-10 13:26:39.0 -0800
@@ -0,0 +1,353 @@
+/*
+ * net/sched/sch_rlim.c  Timer based rate control
+ *
+ * Copyright (c) 2007 Stephen Hemminger [EMAIL PROTECTED]
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/errno.h>
+#include <linux/skbuff.h>
+#include <net/netlink.h>
+#include <net/pkt_sched.h>
+#include <asm/div64.h>
+
+/* Simple Rate control
+
+   Algorithm used in NISTnet and others.
+   Logically similar to Token Bucket, but more real time and less lumpy.
+
+   A packet is not allowed to be dequeued until after the deadline.
+   Each packet dequeued increases the deadline by rate * size.
+
+   If qdisc throttles, it starts a timer, which will wake it up
+   when it is ready to transmit. This scheduler works much better
+   if high resolution timers are available.
+
+   Like classful TBF, limit is just kept for backwards compatibility.
+   It is passed to the default pfifo qdisc - if the inner qdisc is
+   changed the limit is not effective anymore.
+
+*/
+
+/* Use scaled math to get 1/64 ns resolution */
+#define NSEC_SCALE 6
+
+struct rlim_sched_data {
+   ktime_t next_send;  /* next scheduled departure */
+   u64 cost;   /* nsec/byte * 64 */
+   u32 overhead;   /* crc/preamble bytes */
+   u32 limit;  /* upper bound on fifo (packets) */
+
+   struct Qdisc *qdisc;/* Inner qdisc, default - bfifo queue */
+   struct qdisc_watchdog watchdog;
+};
+
+static int rlim_enqueue(struct sk_buff *skb, struct Qdisc *sch)
+{
+   struct rlim_sched_data *q = qdisc_priv(sch);
+   int ret;
+
+   ret = q->qdisc->enqueue(skb, q->qdisc);
+   if (ret)
+       sch->qstats.drops++;
+   else {
+       sch->q.qlen++;
+       sch->bstats.bytes += skb->len;
+       sch->bstats.packets++;
+   }
+
+   return ret;
+}
+
+
+static u64 pkt_time(const struct rlim_sched_data *q,
+   const struct sk_buff *skb)
+{
+   return (q->cost * (skb->len + q->overhead)) >> NSEC_SCALE;
+}
+
+static unsigned int rlim_drop(struct Qdisc *sch)
+{
+   struct rlim_sched_data *q = qdisc_priv(sch);
+   unsigned int len = 0;
+
+   if (q->qdisc->ops->drop && (len = q->qdisc->ops->drop(q->qdisc)) != 0) {
+

[PATCH 0/4] Pull request for 'sis190' branch

Please pull from branch 'sis190' in repository

git://git.kernel.org/pub/scm/linux/kernel/git/romieu/netdev-2.6.git sis190

to get the changes below.

Distance from 'upstream-linus' (7962024e9d16e9349d76b553326f3fa7be64305e)
-

c27e14e508664471b8e44ef1f81ec080213ea314
348de67fe200e25d8cb80cff35642192436cfeda
004a22d03d62cd08e5287273a5143447db009cd0
14deb44ffe7220be2de697d616f28cce17e72297

Diffstat


 drivers/net/sis190.c |   21 ++---
 1 files changed, 10 insertions(+), 11 deletions(-)

Shortlog


Francois Romieu (4):
  sis190: add cmos ram access code for the SiS19x/968 chipset pair
  sis190: remove duplicate INIT_WORK
  sis190: mdio operation failure is not correctly detected
  sis190: scheduling while atomic error

Patch
-

diff --git a/drivers/net/sis190.c b/drivers/net/sis190.c
index 7200883..c0db182 100644
--- a/drivers/net/sis190.c
+++ b/drivers/net/sis190.c
@@ -372,7 +372,7 @@ static void __mdio_cmd(void __iomem *ioaddr, u32 ctl)
msleep(1);
}
 
-   if (i > 999)
+   if (i > 99)
printk(KERN_ERR PFX "PHY command failed !\n");
 }
 
@@ -847,10 +847,8 @@ static void sis190_soft_reset(void __iomem *ioaddr)
 {
SIS_W32(IntrControl, 0x8000);
SIS_PCI_COMMIT();
-   msleep(1);
SIS_W32(IntrControl, 0x0);
sis190_asic_down(ioaddr);
-   msleep(1);
 }
 
 static void sis190_hw_start(struct net_device *dev)
@@ -1041,8 +1039,6 @@ static int sis190_open(struct net_device *dev)
if (rc < 0)
goto err_free_rx_1;
 
-   INIT_WORK(&tp->phy_task, sis190_phy_task);
-
sis190_request_timer(dev);
 
rc = request_irq(dev->irq, sis190_interrupt, IRQF_SHARED, dev->name, dev);
@@ -1549,28 +1545,31 @@ static int __devinit sis190_get_mac_addr_from_eeprom(struct pci_dev *pdev,
 }
 
 /**
- * sis190_get_mac_addr_from_apc - Get MAC address for SiS965 model
+ * sis190_get_mac_addr_from_apc - Get MAC address for SiS96x model
  * @pdev: PCI device
  * @dev:  network device to get address for
  *
- * SiS965 model, use APC CMOS RAM to store MAC address.
+ * SiS96x model, use APC CMOS RAM to store MAC address.
  * APC CMOS RAM is accessed through ISA bridge.
  * MAC address is read into @net_dev->dev_addr.
  */
 static int __devinit sis190_get_mac_addr_from_apc(struct pci_dev *pdev,
  struct net_device *dev)
 {
+   static const u16 __devinitdata ids[] = { 0x0965, 0x0966, 0x0968 };
struct sis190_private *tp = netdev_priv(dev);
struct pci_dev *isa_bridge;
u8 reg, tmp8;
-   int i;
+   unsigned int i;
 
net_probe(tp, KERN_INFO "%s: Read MAC address from APC.\n",
  pci_name(pdev));
 
-   isa_bridge = pci_get_device(PCI_VENDOR_ID_SI, 0x0965, NULL);
-   if (!isa_bridge)
-   isa_bridge = pci_get_device(PCI_VENDOR_ID_SI, 0x0966, NULL);
+   for (i = 0; i < ARRAY_SIZE(ids); i++) {
+   isa_bridge = pci_get_device(PCI_VENDOR_ID_SI, ids[i], NULL);
+   if (isa_bridge)
+   break;
+   }
 
if (!isa_bridge) {
net_probe(tp, KERN_INFO "%s: Can not find ISA bridge.\n",
-- 
Ueimor

[PATCH 2/4] sis190: remove duplicate INIT_WORK

It is already done in sis190_init_one.

Signed-off-by: Francois Romieu [EMAIL PROTECTED]
Cc: K.M. Liu [EMAIL PROTECTED]
---
 drivers/net/sis190.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/net/sis190.c b/drivers/net/sis190.c
index 51bbb60..f6a921c 100644
--- a/drivers/net/sis190.c
+++ b/drivers/net/sis190.c
@@ -1041,8 +1041,6 @@ static int sis190_open(struct net_device *dev)
if (rc < 0)
goto err_free_rx_1;
 
-   INIT_WORK(&tp->phy_task, sis190_phy_task);
-
sis190_request_timer(dev);
 
rc = request_irq(dev->irq, sis190_interrupt, IRQF_SHARED, dev->name, dev);
-- 
1.5.3.3


[PATCH 3/4] sis190: mdio operation failure is not correctly detected

i ranges from 0 to 100 in the 'for' loop a few lines above.

Reported by davem.
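
To make the off-by-a-digit problem concrete, the following user-space sketch mimics a bounded polling loop and its timeout test. It is only an illustration: MDIO_POLL_TRIES, poll_ready(), poll_failed() and the two callbacks are hypothetical names, and the real driver polls a hardware register with msleep(1) between attempts.

```c
#include <stdbool.h>
#include <stddef.h>

#define MDIO_POLL_TRIES 100     /* loop bound, as in the driver's 'for' loop */

typedef bool (*ready_fn)(void *ctx);

/* Poll until ready() reports success; return the loop counter at exit. */
static unsigned int poll_ready(ready_fn ready, void *ctx)
{
	unsigned int i;

	for (i = 0; i < MDIO_POLL_TRIES; i++) {
		if (ready(ctx))
			break;
	}
	return i;
}

/* The loop counter can never exceed 100, so the test must be "> 99". */
static bool poll_failed(unsigned int i)
{
	return i > MDIO_POLL_TRIES - 1;
}

/* Test doubles: a device that never answers, and one that needs *ctx polls. */
static bool never_ready(void *ctx)
{
	(void)ctx;
	return false;
}

static bool ready_after(void *ctx)
{
	int *remaining = ctx;

	return (*remaining)-- <= 0;
}
```

With a device that never answers the counter ends at 100, which the old "i > 999" comparison could not detect.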

Signed-off-by: Francois Romieu [EMAIL PROTECTED]
Cc: K.M. Liu [EMAIL PROTECTED]
---
 drivers/net/sis190.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/sis190.c b/drivers/net/sis190.c
index f6a921c..973b369 100644
--- a/drivers/net/sis190.c
+++ b/drivers/net/sis190.c
@@ -372,7 +372,7 @@ static void __mdio_cmd(void __iomem *ioaddr, u32 ctl)
msleep(1);
}
 
-   if (i > 999)
+   if (i > 99)
printk(KERN_ERR PFX "PHY command failed !\n");
 }
 
-- 
1.5.3.3


[PATCH 1/4] sis190: add cmos ram access code for the SiS19x/968 chipset pair

More work is needed to handle correctly the PHY of the new devices
when connected to a 10Mb link but this change already helps some
users as is.

Fix for:
http://bugzilla.kernel.org/show_bug.cgi?id=9467

Signed-off-by: Francois Romieu [EMAIL PROTECTED]
Cc: K.M. Liu [EMAIL PROTECTED]
Cc: J. Gleacher [EMAIL PROTECTED]
Cc: Alexandre Penasso Teixeira [EMAIL PROTECTED]
Cc: Arliton Rocha [EMAIL PROTECTED]
Cc: Juan Jose Pablos [EMAIL PROTECTED]
Cc: Wipat Srutiprom [EMAIL PROTECTED]
---
 drivers/net/sis190.c |   15 +--
 1 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/net/sis190.c b/drivers/net/sis190.c
index 7200883..51bbb60 100644
--- a/drivers/net/sis190.c
+++ b/drivers/net/sis190.c
@@ -1549,28 +1549,31 @@ static int __devinit sis190_get_mac_addr_from_eeprom(struct pci_dev *pdev,
 }
 
 /**
- * sis190_get_mac_addr_from_apc - Get MAC address for SiS965 model
+ * sis190_get_mac_addr_from_apc - Get MAC address for SiS96x model
  * @pdev: PCI device
  * @dev:  network device to get address for
  *
- * SiS965 model, use APC CMOS RAM to store MAC address.
+ * SiS96x model, use APC CMOS RAM to store MAC address.
  * APC CMOS RAM is accessed through ISA bridge.
  * MAC address is read into @net_dev->dev_addr.
  */
 static int __devinit sis190_get_mac_addr_from_apc(struct pci_dev *pdev,
  struct net_device *dev)
 {
+   static const u16 __devinitdata ids[] = { 0x0965, 0x0966, 0x0968 };
struct sis190_private *tp = netdev_priv(dev);
struct pci_dev *isa_bridge;
u8 reg, tmp8;
-   int i;
+   unsigned int i;
 
net_probe(tp, KERN_INFO "%s: Read MAC address from APC.\n",
  pci_name(pdev));
 
-   isa_bridge = pci_get_device(PCI_VENDOR_ID_SI, 0x0965, NULL);
-   if (!isa_bridge)
-   isa_bridge = pci_get_device(PCI_VENDOR_ID_SI, 0x0966, NULL);
+   for (i = 0; i < ARRAY_SIZE(ids); i++) {
+   isa_bridge = pci_get_device(PCI_VENDOR_ID_SI, ids[i], NULL);
+   if (isa_bridge)
+   break;
+   }
 
if (!isa_bridge) {
net_probe(tp, KERN_INFO "%s: Can not find ISA bridge.\n",
-- 
1.5.3.3


[PATCH 4/4] sis190: scheduling while atomic error

sis190_tx_timeout
- sis190_hw_start
   - sis190_soft_reset
  - msleep *splat*

PCI transactions are correctly flushed here.
The msleep() is probably useless.

Signed-off-by: Francois Romieu [EMAIL PROTECTED]
Cc: K.M. Liu [EMAIL PROTECTED]
---
 drivers/net/sis190.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/net/sis190.c b/drivers/net/sis190.c
index 973b369..c0db182 100644
--- a/drivers/net/sis190.c
+++ b/drivers/net/sis190.c
@@ -847,10 +847,8 @@ static void sis190_soft_reset(void __iomem *ioaddr)
 {
SIS_W32(IntrControl, 0x8000);
SIS_PCI_COMMIT();
-   msleep(1);
SIS_W32(IntrControl, 0x0);
sis190_asic_down(ioaddr);
-   msleep(1);
 }
 
 static void sis190_hw_start(struct net_device *dev)
-- 
1.5.3.3


[IPv4] ESP: Discard dummy packets introduced in rfc4303

2007-12-10 Thread Thomas Graf

RFC4303 introduces dummy packets with a nexthdr value of 59
to implement traffic confidentiality. Such packets need to
be dropped silently and the payload may not be attempted to
be parsed as it consists of random chunk.
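
The rule can be sketched outside the kernel as a small classifier over the decrypted ESP payload, whose last two bytes are the pad length and the next header. This is a simplified illustration only: esp_check_trailer(), the verdict enum and NEXTHDR_NONE are hypothetical names, and the ICV is assumed to have been verified and stripped already.

```c
#include <stddef.h>
#include <stdint.h>

#define NEXTHDR_NONE 59   /* "no next header", the value IPPROTO_NONE stands for */

enum esp_verdict {
	ESP_DELIVER,      /* hand the inner packet to the next protocol */
	ESP_DROP_DUMMY,   /* RFC 4303 dummy packet: discard silently */
	ESP_MALFORMED     /* trailer does not fit in the payload */
};

/*
 * payload/len cover the decrypted data including the two-byte trailer
 * (pad length, next header). On ESP_DELIVER, *next_hdr is set.
 */
static enum esp_verdict esp_check_trailer(const uint8_t *payload, size_t len,
					  uint8_t *next_hdr)
{
	uint8_t padlen, nh;

	if (len < 2)
		return ESP_MALFORMED;

	padlen = payload[len - 2];
	nh = payload[len - 1];

	if ((size_t)padlen + 2 > len)
		return ESP_MALFORMED;

	/* Dummy traffic: never look at the (random) payload bytes. */
	if (nh == NEXTHDR_NONE)
		return ESP_DROP_DUMMY;

	*next_hdr = nh;
	return ESP_DELIVER;
}
```

The important property is that the dummy check happens before anything tries to interpret the payload, and that the drop is not counted as an error.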

Signed-off-by: Thomas Graf [EMAIL PROTECTED]

Index: net-2.6.25/net/ipv4/esp4.c
===
--- net-2.6.25.orig/net/ipv4/esp4.c 2007-12-10 15:57:23.0 +0100
+++ net-2.6.25/net/ipv4/esp4.c  2007-12-10 16:06:10.0 +0100
@@ -9,6 +9,7 @@
 #include <linux/pfkeyv2.h>
 #include <linux/random.h>
 #include <linux/spinlock.h>
+#include <linux/in6.h>
 #include <net/icmp.h>
 #include <net/protocol.h>
 #include <net/udp.h>
@@ -233,6 +234,10 @@
 
/* ... check padding bits here. Silly. :-) */
 
+   /* RFC4303: Drop dummy packets without any error */
+   if (nexthdr[1] == IPPROTO_NONE)
+   goto out;
+
iph = ip_hdr(skb);
ihl = iph->ihl * 4;
 

[IPv6] ESP: Discard dummy packets introduced in rfc4303

2007-12-10 Thread Thomas Graf

RFC4303 introduces dummy packets with a nexthdr value of 59
to implement traffic confidentiality. Such packets need to
be dropped silently and the payload may not be attempted to
be parsed as it consists of random chunk.

Signed-off-by: Thomas Graf [EMAIL PROTECTED]

Index: net-2.6.25/net/ipv6/esp6.c
===
--- net-2.6.25.orig/net/ipv6/esp6.c 2007-12-10 16:06:02.0 +0100
+++ net-2.6.25/net/ipv6/esp6.c  2007-12-10 16:08:02.0 +0100
@@ -238,6 +238,12 @@
}
/* ... check padding bits here. Silly. :-) */
 
+   /* RFC4303: Drop dummy packets without any error */
+   if (nexthdr[1] == IPPROTO_NONE) {
+   ret = -EINVAL;
+   goto out;
+   }
+
pskb_trim(skb, skb->len - alen - padlen - 2);
ret = nexthdr[1];
}

Re: [PATCH] [TCP]: Bind fackets_out state to highest_sack more tightly

From: Ilpo_Järvinen [EMAIL PROTECTED]
Date: Mon, 10 Dec 2007 14:39:46 +0200 (EET)

 On Mon, 10 Dec 2007, David Miller wrote:

  From: Ilpo_Järvinen [EMAIL PROTECTED]
  Date: Mon, 10 Dec 2007 14:27:24 +0200 (EET)

   Added checks will catch most of the errors if the current
   complex fack_count counting logic is flawed somewhere.

   Fackets_out should always be advanceable if highest_sack is too
   because the fackets_out is nowadays accurate (and obviously it
   must be smaller than packets_out).

   Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED]

  Applied to net-2.6.25, thanks!

 Please get the fack_count fix as well from the mm1 thread before my 
 mailbox gets filled with stacktraces :-) :
   http://marc.info/?l=linux-netdev&m=119728952018975&w=2

Done, thanks!

Re: [PATCH][Take3] PCI legacy I/O port free driver - Making Intel e1000 driver legacy I/O port free

2007-12-10 Thread Kok, Auke

Tomohiro Kusumi wrote:
 Dear Auke and e1000 maintainers
 
 Hi, this is the patch which makes the e1000 driver legacy I/O port free.
 
 I've received some advice from Auke quite long time ago, and submitted
 a patch (http://lkml.org/lkml/2007/8/10/11) which I think meets what Auke
 had told me. Since the patch has not received any reaction from the e1000
 community, let me submit it once again (plus, the previous one had a bug
 regarding module parameter).


this opens up an interesting discussion -

e1000 is going to be the driver for 8254x hardware only from 2.6.25 and on.
e1000e will be the driver that powers 8257x hardware (and ich8/9 and es2lan
NICs) and those are all the pci-e hardware devices.

This means that the current e1000 driver will not power the pci-e hardware
anymore and thus those io-port free devices are removed from e1000.

considering the fact that only 82542, 82543 and 82547 devices are (from 2.6.25
on) the only devices that can be ioport free in this new e1000 driver, I think
that it almost makes no sense to code this functionality up for that.

so, I'm wondering if we should not drop this effort altogether, since it's
just a lot of code and none of the pci-e hardware should use ioport anymore.

Can you screen e1000e in jeff garzik's netdev-2.6#upstream tree and see if
that is correctly not using ioport? I think that would be worth the time.

Cheers,

Auke



 
 If the module parameter enable_legacy_ioport_free is set to 0, it does
 not differ from the existing e1000 driver, otherwise legacy I/O port free
 function is enabled. I may have done something wrong, so any comments
 would be helpful.
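
For readers unfamiliar with the BAR selection the patch relies on, here is a user-space mock of the idea behind pci_select_bars(): build a bitmask with one bit per base address register whose resource type matches the requested flags. The names RES_IO, RES_MEM, NUM_BARS and select_bars() are made up for this sketch, and the flag values are placeholders rather than the kernel's IORESOURCE_* constants.

```c
#define RES_IO   0x100u   /* placeholder for an I/O port resource flag */
#define RES_MEM  0x200u   /* placeholder for a memory-mapped resource flag */
#define NUM_BARS 6

/*
 * Return a mask with bit i set when BAR i has any of the wanted
 * resource types. A legacy I/O port free driver asks for RES_MEM only.
 */
static int select_bars(const unsigned int flags[NUM_BARS], unsigned int wanted)
{
	int i, mask = 0;

	for (i = 0; i < NUM_BARS; i++) {
		if (flags[i] & wanted)
			mask |= 1 << i;
	}
	return mask;
}
```

A device with a memory BAR 0 and an I/O BAR 2 yields mask 0x1 when only memory is requested and 0x5 when both types are requested, which is the distinction the patch makes through use_ioport.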
 
 Tomohiro Kusumi
 Signed-off-by: Tomohiro Kusumi [EMAIL PROTECTED]
 
 ---
 diff -Nur linux-2.6.23.org/drivers/net/e1000/e1000.h linux-2.6.23/drivers/net/e1000/e1000.h
 --- linux-2.6.23.org/drivers/net/e1000/e1000.h 2007-10-16 11:30:37.0 +0900
 +++ linux-2.6.23/drivers/net/e1000/e1000.h 2007-10-16 11:32:55.0 +0900
 @@ -342,6 +342,9 @@
   boolean_t quad_port_a;
   unsigned long flags;
   uint32_t eeprom_wol;
 +
 + int use_ioport;
 + int bars;
  };
 
  enum e1000_state_t {
 diff -Nur linux-2.6.23.org/drivers/net/e1000/e1000_main.c linux-2.6.23/drivers/net/e1000/e1000_main.c
 --- linux-2.6.23.org/drivers/net/e1000/e1000_main.c 2007-10-16 11:30:38.0 +0900
 +++ linux-2.6.23/drivers/net/e1000/e1000_main.c 2007-10-16 14:48:16.390575464 +0900
 @@ -226,6 +226,11 @@
  static pci_ers_result_t e1000_io_slot_reset(struct pci_dev *pdev);
  static void e1000_io_resume(struct pci_dev *pdev);
 
 +static unsigned int enable_legacy_ioport_free = 0;
 +module_param(enable_legacy_ioport_free, uint, 0644);
 +MODULE_PARM_DESC(enable_legacy_ioport_free, "Enable legacy I/O port free (default:0)");
 +static int e1000_test_legacy_ioport(struct pci_dev *pdev);
 +
  static struct pci_error_handlers e1000_err_handler = {
   .error_detected = e1000_io_error_detected,
   .slot_reset = e1000_io_slot_reset,
 @@ -872,8 +877,24 @@
   int i, err, pci_using_dac;
   uint16_t eeprom_data = 0;
   uint16_t eeprom_apme_mask = E1000_EEPROM_APME;
 - if ((err = pci_enable_device(pdev)))
 + int bars = 0;
 + int use_ioport = 0;
 +
 + if (enable_legacy_ioport_free) {
 + if ((use_ioport = e1000_test_legacy_ioport(pdev)) < 0) {
 + E1000_ERR("e1000_test_legacy_ioport failed, aborting\n");
 + return -1;
 + }
 + if (use_ioport)
 + bars = pci_select_bars(pdev, IORESOURCE_MEM | IORESOURCE_IO);
 + else
 + bars = pci_select_bars(pdev, IORESOURCE_MEM);
 + if ((err = pci_enable_device_bars(pdev, bars)) != 0)
 + return err;
 + }
 + else if ((err = pci_enable_device(pdev)) != 0) {
   return err;
 + }
 
   if (!(err = pci_set_dma_mask(pdev, DMA_64BIT_MASK)) &&
   !(err = pci_set_consistent_dma_mask(pdev, DMA_64BIT_MASK))) {
 @@ -887,7 +908,11 @@
   pci_using_dac = 0;
   }
 
 - if ((err = pci_request_regions(pdev, e1000_driver_name)))
 + if (enable_legacy_ioport_free)
 + err = pci_request_selected_regions(pdev, bars, e1000_driver_name);
 + else
 + err = pci_request_regions(pdev, e1000_driver_name);
 + if (err)
   goto err_pci_reg;
 
   pci_set_master(pdev);
 @@ -906,6 +931,10 @@
   adapter->pdev = pdev;
   adapter->hw.back = adapter;
   adapter->msg_enable = (1 << debug) - 1;
 + if (enable_legacy_ioport_free) {
 + adapter->use_ioport = use_ioport;
 + adapter->bars = bars;
 + }
 
   mmio_start = pci_resource_start(pdev, BAR_0);
   mmio_len = pci_resource_len(pdev, BAR_0);
 @@ -915,12 +944,14 @@
   if (!adapter->hw.hw_addr)
   goto err_ioremap;
 
 - for (i = BAR_1; i <= BAR_5; i++) {
 - if

Re: [PATCH 2/3] arch/ : Platform changes for UCC TDM driver for MPC8323ERDB.Also includes related QE changes.

2007-12-10 Thread Stephen Rothwell

On Mon, 10 Dec 2007 17:39:22 +0530 (IST) Poonam_Aggrwal-b10812 [EMAIL PROTECTED] wrote:

 +++ b/arch/powerpc/sysdev/qe_lib/qe.c
 @@ -149,22 +149,116 @@ EXPORT_SYMBOL(qe_issue_cmd);
   */
  static unsigned int brg_clk = 0;
  
 -unsigned int get_brg_clk(void)
 +u32 get_brg_clk(enum qe_clock brgclk, enum qe_clock *brg_source)
  {
 - struct device_node *qe;
 - if (brg_clk)
 - return brg_clk;
 + struct device_node *qe, *brg, *clocks;
 + enum qe_clock brg_src;
 + u32 brg_input_freq = 0;
 + u32 brg_num;
 + const unsigned int *prop;
  
 - qe = of_find_node_by_type(NULL, "qe");
 - if (qe) {
 + *brg_source = 0;
 +
 + brg_num = brgclk - QE_BRG1;
 + brg = of_find_compatible_node(NULL, NULL, "fsl,cpm-brg");
 + if (brg) {
   unsigned int size;
 - const u32 *prop = of_get_property(qe, "brg-frequency", &size);
 - brg_clk = *prop;
 - of_node_put(qe);
 - };
 + prop = of_get_property(brg,
 + "fsl,brg-sources", &size);
 +
 + brg_src = *(prop + brg_num);

You should probably sanity check that prop is not NULL and points to
something large enough.

You don't use brg after here, so the of_node_put(brg) could go here to
save putting it in multiple places later.  Also, currently there are
paths through the following code that do not do the of_node_put(brg).
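
A minimal sketch of the kind of sanity check being asked for, written as a user-space helper so it can be tried in isolation. prop_read_u32() is a hypothetical name, not an existing OF API; it takes the pointer and byte length that a property lookup would hand back and refuses to dereference anything that is missing or too short. Endianness conversion is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Read cell 'index' from a property made of 32-bit cells.
 * prop may be NULL (property absent) and size is its length in bytes.
 * Returns 0 and stores the value in *out, or -1 if the cell is not there.
 */
static int prop_read_u32(const uint32_t *prop, size_t size, size_t index,
			 uint32_t *out)
{
	if (!prop)
		return -1;
	if ((index + 1) * sizeof(*prop) > size)
		return -1;

	*out = prop[index];
	return 0;
}
```

In the function under review, each "*prop" or "*(prop + n)" dereference would go through a check of this shape, with the node reference dropped on the failure path as well.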

 + if (brg_src == 0) {
 + *brg_source = 0;
 + if (brg_clk > 0) {
 + of_node_put(brg);
 + return brg_clk;
 + }
 + qe = of_find_node_by_type(NULL, "qe");
 + if (qe) {
 + unsigned int size;
 + prop = of_get_property
 + (qe, "brg-frequency", &size);
 + of_node_put(qe);
 + of_node_put(brg);
 + return *prop;

NULL check here (yes, I know that the old code didn't check).

 + }
 + } else {
 + *brg_source = brg_src + QE_CLK1 - 1;
 + clocks = of_find_compatible_node(NULL, NULL,
 + "fsl,cpm-clocks");
 + prop = of_get_property(clocks,
 + "#clock-cells", &size);
 + /*
 +  * clock-cells = 1 only supported right now.
 +  */
 + if (*prop != 1)

Again check for NULL (and possibly size).

 + return 0;
 + prop = of_get_property(clocks,
 + "clock-frequency", &size);
 +
 + brg_input_freq = *(prop+(brg_src - 1));

And again.

 + of_node_put(clocks);
 + of_node_put(brg);
 + return brg_input_freq;
 + }
 + }
   return brg_clk;
  }
-- 
Cheers,
Stephen Rothwell[EMAIL PROTECTED]
http://www.canb.auug.org.au/~sfr/



Re: [patch 0/5] ipv6: make af_inet6 subsystems to return an error at init

From: Daniel Lezcano [EMAIL PROTECTED]
Date: Mon, 10 Dec 2007 16:32:50 +0100

 I just noticed that I forgot to put ipv6 under bracket.
 Sorry for that :(
 
 Should I resend the patchset ?

This is not necessary.

Re: [IPv4] ESP: Discard dummy packets introduced in rfc4303

From: Thomas Graf [EMAIL PROTECTED]
Date: Mon, 10 Dec 2007 23:17:03 +0100

 RFC4303 introduces dummy packets with a nexthdr value of 59
 to implement traffic confidentiality. Such packets need to
 be dropped silently and the payload may not be attempted to
 be parsed as it consists of random chunk.

 Signed-off-by: Thomas Graf [EMAIL PROTECTED]

Applied to net-2.6, since this is more of a bug fix than
anything else.

Re: [IPv6] ESP: Discard dummy packets introduced in rfc4303

From: Thomas Graf [EMAIL PROTECTED]
Date: Mon, 10 Dec 2007 23:18:07 +0100

 RFC4303 introduces dummy packets with a nexthdr value of 59
 to implement traffic confidentiality. Such packets need to
 be dropped silently and the payload may not be attempted to
 be parsed as it consists of random chunk.

 Signed-off-by: Thomas Graf [EMAIL PROTECTED]

Also applied to net-2.6, thanks Thomas.

Re: [PATCH 1/3][BNX2]: Add PHY_DIS_EARLY_DAC workaround.

From: Michael Chan [EMAIL PROTECTED]
Date: Sun, 09 Dec 2007 13:16:48 -0800

 [BNX2]: Add PHY_DIS_EARLY_DAC workaround.
 
 5709 Ax and Bx chips all need this workaround.
 
 Signed-off-by: Michael Chan [EMAIL PROTECTED]

Applied.

Re: [PATCH 2/3][BNX2]: Fix RX packet rot.

From: Michael Chan [EMAIL PROTECTED]
Date: Sun, 09 Dec 2007 13:17:14 -0800

 [BNX2]: Fix RX packet rot.

 Packets can be left in the RX ring if the NAPI budget is reached.
 This is caused by storing the latest rx index at the beginning of
 bnx2_rx_int().  We may not process all the work up to this index
 if the budget is reached and so some packets in the RX ring may rot
 when we later check for more work using this stored rx index.

 The fix is to not store this latest hw index and only store the
 processed rx index.  We use a new function bnx2_get_hw_rx_cons()
 to fetch the latest hw rx index.
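
The failure mode and the fix can be modelled in a few lines of user-space C. This is a toy model, not driver code: struct rx_ring, get_hw_rx_cons(), rx_poll() and rx_has_work() are hypothetical names, the indices are plain free-running counters, and "processing" a packet just advances the software index.

```c
#include <stdbool.h>

/* Toy ring state: hw_cons is advanced by "hardware", sw_cons by the poller. */
struct rx_ring {
	unsigned int hw_cons;   /* latest index the hardware has filled */
	unsigned int sw_cons;   /* last index the driver has processed */
};

/* Always read the live hardware index instead of a copy saved earlier. */
static unsigned int get_hw_rx_cons(const struct rx_ring *r)
{
	return r->hw_cons;
}

/* Process at most 'budget' packets; only the processed index is stored. */
static int rx_poll(struct rx_ring *r, int budget)
{
	int done = 0;

	while (r->sw_cons != get_hw_rx_cons(r) && done < budget) {
		r->sw_cons++;   /* stand-in for handing one packet up the stack */
		done++;
	}
	return done;
}

/* "More work?" compares against the live index, so nothing is left to rot. */
static bool rx_has_work(const struct rx_ring *r)
{
	return r->sw_cons != get_hw_rx_cons(r);
}
```

With ten packets pending and a budget of four, the first poll handles four packets and rx_has_work() still reports work, so the remaining six are picked up on the next poll.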

 Signed-off-by: Michael Chan [EMAIL PROTECTED]

Applied.

Re: [PATCH 3/3][BNX2]: Update version to 1.6.9.

From: Michael Chan [EMAIL PROTECTED]
Date: Sun, 09 Dec 2007 13:18:02 -0800

 [BNX2]: Update version to 1.6.9.
 
 Signed-off-by: Michael Chan [EMAIL PROTECTED]

Applied.

Re: [PATCH 4/4] udp: memory accounting in IPv4

2007-12-10 Thread Hideo AOKI

Herbert Xu wrote:
 On Wed, Dec 05, 2007 at 11:28:34PM -0500, Hideo AOKI wrote:
 1. Using sk_forward_alloc and adding socket lock

UDP already uses a socket lock to send message. However, it doesn't
use the lock to receive message. I wonder if we can also use the
lock when sk_forward_alloc is updated in receive processing.
I understand performance issue might occur, but ...
 
 Having discussed this with Dave we've agreed that this is the
 best way to go.
 
 Thanks,

Hello,

Thank you so much for reviewing.

I chose this solution and developed new patch set.
I'm testing the patch set right now.
I'll submit it to netdev as soon as I finish the test.

Best regards,
Hideo

-- 
Hitachi Computer Products (America) Inc.

[IPSEC]: Add xfrm_input_state helper

Hi Dave:

This is the last patch we need before converting ESP over to
crypto_aead.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
[IPSEC]: Add xfrm_input_state helper

This patch adds the xfrm_input_state helper function which returns the
current xfrm state being processed on the input path given an sk_buff.
This is currently only used by xfrm_input but will be used by ESP upon
asynchronous resumption.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index fb154a6..c49fe0f 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1302,4 +1302,9 @@ static inline void xfrm_states_delete(struct xfrm_state **states, int n)
 }
 #endif
 
+static inline struct xfrm_state *xfrm_input_state(struct sk_buff *skb)
+{
+   return skb->sp->xvec[skb->sp->len - 1];
+}
+
 #endif /* _NET_XFRM_H */
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 8b2b1b5..8624cbd 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -109,7 +109,7 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
/* A negative encap_type indicates async resumption. */
if (encap_type < 0) {
async = 1;
-   x = skb->sp->xvec[skb->sp->len - 1];
+   x = xfrm_input_state(skb);
seq = XFRM_SKB_CB(skb)->seq;
goto resume;
}

Re: [PATCH 2.6.25] netns: struct net content re-work

2007-12-10 Thread Eric W. Biederman


The idea of separate structures make sense, and seems needed and useful.

Denis V. Lunev [EMAIL PROTECTED] writes:

 diff --git a/include/net/netns/unix.h b/include/net/netns/unix.h
 new file mode 100644
 index 000..27b4e7f
 --- /dev/null
 +++ b/include/net/netns/unix.h
   ^^
Given that we are making this per protocol adding a separate directory
to hold them seems to be the wrong grouping.  Ideally we want everything
for the protocol all together in the same location so it is easy
to find.  Possibly with a user/kernel split.

So perhaps unix_net.h

 @@ -0,0 +1,13 @@
 +/*
 + * Unix network namespace
 + */
 +#ifndef __NETNS_UNIX_H__
 +#define __NETNS_UNIX_H__
 +
 +struct ctl_table_header;
 +struct netns_unix {
 + int sysctl_unix_max_dgram_qlen;
 + struct ctl_table_header *unix_ctl;
 +};

How about struct unix_net?  I think that tracks a little better
with how we have done struct in_device, ip6_dev and their friends.

Eric

[PATCH] NET : dst_ifdown() cleanup

2007-12-10 Thread Eric Dumazet


This cleanup shrinks size of net/core/dst.o on i386 from 1299 to 1289 bytes.
(This is because dev_hold()/dev_put() are doing atomic_inc()/atomic_dec() and 
force compiler to re-evaluate memory contents.)


Signed-off-by: Eric Dumazet [EMAIL PROTECTED]

diff --git a/net/core/dst.c b/net/core/dst.c
index 5c6cfc4..7eceeba 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -284,8 +284,8 @@ static inline void dst_ifdown(struct dst_entry *dst, struct net_device *dev,
dev_put(dev);
if (dst->neighbour && dst->neighbour->dev == dev) {
dst->neighbour->dev = dst->dev;
+   dev_hold(dst->dev);
dev_put(dev);
-   dev_hold(dst->neighbour->dev);
}
}
 }

Re: [PATCH 2.6.25] netns: struct net content re-work

2007-12-10 Thread Eric W. Biederman

Kirill Korotaev [EMAIL PROTECTED] writes:

 Daniel Lezcano wrote:
 Denis V. Lunev wrote:
 
Recently David Miller and Herbert Xu pointed out that struct net is becoming
bloated and unmaintainable. There are two solutions:
- provide a pointer to a network subsystem definition from struct net.
  This costs an additional dereference
- place the sub-system definition into the structure itself. This will speed up
  run-time access at the cost of recompilation time

The second approach looks better for us. 
 
 
 Yes, we do not need or want a pointer in this structure, which would add
 more dereferences in the network code.

If it does go that way we just carefully pass around a properly
typed structure in that subsystem to reduce the cost.  Still 
it would be nice not to need to add the extra pointer.
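
The two options being weighed can be sketched as below. The demo_* names are hypothetical and only mirror the shape of struct net and struct netns_unix from the patch. With a pointer, every access goes through one extra load; with the structure embedded, the field sits at a fixed offset inside struct net, but net_namespace.h then has to include the subsystem header, so a change to it triggers a wider rebuild.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-subsystem block, modelled on struct netns_unix. */
struct demo_netns_unix { int sysctl_unix_max_dgram_qlen; };

/* Option 1: struct net only holds a pointer to the subsystem block. */
struct demo_net_ptr { struct demo_netns_unix *unx; };

/* Option 2: the subsystem block is embedded in struct net. */
struct demo_net_embed { struct demo_netns_unix unx; };

static int qlen_via_pointer(const struct demo_net_ptr *net)
{
	return net->unx->sysctl_unix_max_dgram_qlen;	/* two dependent loads */
}

static int qlen_embedded(const struct demo_net_embed *net)
{
	return net->unx.sysctl_unix_max_dgram_qlen;	/* one load, fixed offset */
}
```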

index b62e31f..f60e1ce 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -8,6 +8,8 @@
 #include <linux/workqueue.h>
 #include <linux/list.h>
 
+#include <net/netns/unix.h>
+
 struct proc_dir_entry;
 struct net_device;
 struct sock;
@@ -46,8 +48,7 @@ struct net {
 	struct hlist_head	packet_sklist;
 
 	/* unix sockets */
-	int			sysctl_unix_max_dgram_qlen;
-	struct ctl_table_header	*unix_ctl;
+	struct netns_unix	unx;
 
 
 Can you change this from unx to unix ?

 no, it won't compile. Guess why :)

Hmm.  It looks like it is a #define somewhere in gcc?
Eric

Re: [PATCH 2.6.25] netns: struct net content re-work

From: [EMAIL PROTECTED] (Eric W. Biederman)
Date: Mon, 10 Dec 2007 21:04:07 -0700

 Kirill Korotaev [EMAIL PROTECTED] writes:

  Daniel Lezcano wrote:
  Denis V. Lunev wrote:
  Can you change this from unx to unix ?

  no, it won't compile. Guess why :)

 Hmm.  It looks like it is a #define somewhere in gcc?

It is a platform CPP pre-define for UNIX platforms.
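
A small sketch of the problem, under the assumption that the compiler runs in a GNU dialect mode on a Unix-like target, where `unix` is predefined as the constant 1. A member declared as `struct netns_unix unix;` would then expand to `struct netns_unix 1;` and fail to parse. The demo_* names and the `#undef` workaround are illustrative only; the patch sidesteps the issue by naming the member unx.

```c
#include <assert.h>

/* Record whether the preprocessor predefined `unix`, then remove it. */
#ifdef unix
#define DEMO_UNIX_PREDEFINED 1
#undef unix	/* from here on, `unix` is an ordinary identifier */
#else
#define DEMO_UNIX_PREDEFINED 0
#endif

struct demo_netns_unix { int sysctl_unix_max_dgram_qlen; };

/* Compiles only because the macro, if any, was undefined above. */
struct demo_net { struct demo_netns_unix unix; };

static int demo_unix_was_predefined(void)
{
	return DEMO_UNIX_PREDEFINED;
}
```

In a strict ISO mode such as -std=c99 the bare `unix` macro is not predefined, while `__unix__` is available in both modes.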

[RFC 0/3] Add AEAD support to ESP

Hi Dave:

This series of patches adds AEAD support to ESP.

Please don't merge it just yet because it depends on what's
in the current cryptodev-2.6 tree.  Once that tree has settled
down I'll ask you to pull it and then these patches can go on
top of that.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

[PATCH 2/3] [IPSEC]: Allow async algorithms

[IPSEC]: Allow async algorithms

Now that ESP uses authenc we can turn on the support for async algorithms
in IPsec.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 net/xfrm/xfrm_algo.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/xfrm/xfrm_algo.c b/net/xfrm/xfrm_algo.c
index 1686f64..ae34a12 100644
--- a/net/xfrm/xfrm_algo.c
+++ b/net/xfrm/xfrm_algo.c
@@ -358,21 +358,21 @@ static const struct xfrm_algo_list xfrm_aalg_list = {
 	.algs = aalg_list,
 	.entries = ARRAY_SIZE(aalg_list),
 	.type = CRYPTO_ALG_TYPE_HASH,
-	.mask = CRYPTO_ALG_TYPE_HASH_MASK | CRYPTO_ALG_ASYNC,
+	.mask = CRYPTO_ALG_TYPE_HASH_MASK,
 };
 
 static const struct xfrm_algo_list xfrm_ealg_list = {
 	.algs = ealg_list,
 	.entries = ARRAY_SIZE(ealg_list),
 	.type = CRYPTO_ALG_TYPE_BLKCIPHER,
-	.mask = CRYPTO_ALG_TYPE_MASK | CRYPTO_ALG_ASYNC,
+	.mask = CRYPTO_ALG_TYPE_BLKCIPHER_MASK,
 };
 
 static const struct xfrm_algo_list xfrm_calg_list = {
 	.algs = calg_list,
 	.entries = ARRAY_SIZE(calg_list),
 	.type = CRYPTO_ALG_TYPE_COMPRESS,
-	.mask = CRYPTO_ALG_TYPE_MASK | CRYPTO_ALG_ASYNC,
+	.mask = CRYPTO_ALG_TYPE_MASK,
 };
 
 static struct xfrm_algo_desc *xfrm_find_algo(
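
Why dropping CRYPTO_ALG_ASYNC from the mask admits async algorithms can be sketched as follows. This assumes the usual type/mask rule, where an algorithm matches when its flags agree with the requested type on every bit set in the mask. The DEMO_* constants are illustrative values chosen for the example, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative flag values; assumptions for this sketch only. */
#define DEMO_ALG_TYPE_MASK	0x0000000fu
#define DEMO_ALG_TYPE_COMPRESS	0x00000002u
#define DEMO_ALG_ASYNC		0x00000080u

/* Match when flags and type agree on every bit selected by mask. */
static int demo_alg_matches(uint32_t alg_flags, uint32_t type, uint32_t mask)
{
	return ((alg_flags ^ type) & mask) == 0;
}
```

With the ASYNC bit in the mask and clear in the type, only synchronous implementations pass; once the bit leaves the mask it is no longer compared, so both kinds match.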

[PATCH 1/3] [IPSEC]: Use crypto_aead and authenc in ESP

[IPSEC]: Use crypto_aead and authenc in ESP

This patch converts ESP to use the crypto_aead interface and in particular
the authenc algorithm.  This lays the foundations for future support of
combined mode algorithms.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 include/net/esp.h |   54 +-
 net/ipv4/esp4.c   |  465 --
 net/ipv6/esp6.c   |  422 +
 3 files changed, 536 insertions(+), 405 deletions(-)

diff --git a/include/net/esp.h b/include/net/esp.h
index c05f529..d9834f7 100644
--- a/include/net/esp.h
+++ b/include/net/esp.h
@@ -1,58 +1,22 @@
 #ifndef _NET_ESP_H
 #define _NET_ESP_H
 
-#include <linux/crypto.h>
-#include <net/xfrm.h>
-#include <linux/scatterlist.h>
+#include <linux/skbuff.h>
 
 #define ESP_NUM_FAST_SG	4
 
-struct esp_data
-{
-	struct scatterlist		sgbuf[ESP_NUM_FAST_SG];
-
-	/* Confidentiality */
-	struct {
-		int			padlen;		/* 0..255 */
-		/* ivlen is offset from enc_data, where encrypted data start.
-		 * It is logically different of crypto_tfm_alg_ivsize(tfm).
-		 * We assume that it is either zero (no ivec), or
-		 * >= crypto_tfm_alg_ivsize(tfm). */
-		int			ivlen;
-		int			ivinitted;
-		u8			*ivec;		/* ivec buffer */
-		struct crypto_blkcipher	*tfm;		/* crypto handle */
-	} conf;
-
-	/* Integrity. It is active when icv_full_len != 0 */
-	struct {
-		u8			*work_icv;
-		int			icv_full_len;
-		int			icv_trunc_len;
-		struct crypto_hash	*tfm;
-	} auth;
+struct crypto_aead;
+
+struct esp_data {
+	/* 0..255 */
+	int padlen;
+
+	/* Confidentiality & Integrity */
+	struct crypto_aead *aead;
 };
 
 extern void *pskb_put(struct sk_buff *skb, struct sk_buff *tail, int len);
 
-static inline int esp_mac_digest(struct esp_data *esp, struct sk_buff *skb,
-				 int offset, int len)
-{
-	struct hash_desc desc;
-	int err;
-
-	desc.tfm = esp->auth.tfm;
-	desc.flags = 0;
-
-	err = crypto_hash_init(&desc);
-	if (unlikely(err))
-		return err;
-	err = skb_icv_walk(skb, &desc, offset, len, crypto_hash_update);
-	if (unlikely(err))
-		return err;
-	return crypto_hash_final(&desc, esp->auth.work_icv);
-}
-
 struct ip_esp_hdr;
 
 static inline struct ip_esp_hdr *ip_esp_hdr(const struct sk_buff *skb)
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index de4592c..c1f5936 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -1,3 +1,5 @@
+#include <crypto/aead.h>
+#include <crypto/authenc.h>
 #include <linux/err.h>
 #include <linux/module.h>
 #include <net/ip.h>
@@ -7,20 +9,91 @@
 #include <linux/crypto.h>
 #include <linux/kernel.h>
 #include <linux/pfkeyv2.h>
-#include <linux/random.h>
+#include <linux/rtnetlink.h>
 #include <linux/spinlock.h>
 #include <net/icmp.h>
 #include <net/protocol.h>
 #include <net/udp.h>
 
+struct esp_skb_cb {
+	struct xfrm_skb_cb xfrm;
+	void *tmp;
+};
+
+#define ESP_SKB_CB(__skb) ((struct esp_skb_cb *)&((__skb)->cb[0]))
+
+/*
+ * Allocate an AEAD request structure with extra space for SG and IV.
+ *
+ * For alignment considerations the IV is placed at the front, followed
+ * by the request and finally the SG list.
+ *
+ * TODO: Use spare space in skb for this where possible.
+ */
+static void *esp_alloc_tmp(struct crypto_aead *aead, int nfrags)
+{
+	unsigned int len;
+
+	len = crypto_aead_ivsize(aead);
+	if (len) {
+		len += crypto_aead_alignmask(aead) &
+		       ~(crypto_tfm_ctx_alignment() - 1);
+		len = ALIGN(len, crypto_tfm_ctx_alignment());
+	}
+
+	len += sizeof(struct aead_givcrypt_request) + crypto_aead_reqsize(aead);
+	len = ALIGN(len, __alignof__(struct scatterlist));
+
+	len += sizeof(struct scatterlist *) * nfrags;
+
+	return kmalloc(len, GFP_ATOMIC);
+}
+
+static inline u8 *esp_tmp_iv(struct crypto_aead *aead, void *tmp)
+{
+	return crypto_aead_ivsize(aead) ?
+	       PTR_ALIGN((u8 *)tmp, crypto_aead_alignmask(aead) + 1) : tmp;
+}
+
+static inline struct aead_givcrypt_request *esp_tmp_req(
+	struct crypto_aead *aead, u8 *iv)
+{
+	struct aead_givcrypt_request *req;
+
+	req = (void *)PTR_ALIGN(iv + crypto_aead_ivsize(aead),
+				crypto_tfm_ctx_alignment());
+	aead_givcrypt_set_tfm(req, aead);
+	return req;
+}
+
+static inline struct scatterlist *esp_tmp_sg(struct crypto_aead *aead,
+					     struct aead_givcrypt_request *req)
+{
+	return (void *)ALIGN((unsigned long)(req + 1) +
+
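
The size calculation in esp_alloc_tmp() above (IV first, then the request, then the SG list) can be followed with a userspace sketch. Every name here is hypothetical: the parameters stand in for crypto_aead_ivsize(), crypto_aead_alignmask(), crypto_tfm_ctx_alignment(), the request size, and the size and alignment of one SG entry. The sketch only reproduces the length arithmetic, not the runtime pointer alignment done by the esp_tmp_*() helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to a multiple of a, where a is a power of two. */
#define DEMO_ALIGN(x, a) (((x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

struct demo_sizes {
	size_t iv_area;	/* bytes reserved for the IV, including slack */
	size_t req_end;	/* bytes used once the request is added and aligned */
	size_t total;	/* bytes to allocate */
};

static struct demo_sizes demo_tmp_sizes(size_t ivsize, size_t iv_alignmask,
					size_t ctx_align, size_t req_size,
					size_t sg_size, size_t sg_align,
					size_t nfrags)
{
	struct demo_sizes s;
	size_t len = ivsize;

	if (len) {
		/* Slack for aligning the IV beyond the allocator's guarantee. */
		len += iv_alignmask & ~(ctx_align - 1);
		len = DEMO_ALIGN(len, ctx_align);
	}
	s.iv_area = len;

	len += req_size;
	len = DEMO_ALIGN(len, sg_align);
	s.req_end = len;

	len += sg_size * nfrags;
	s.total = len;
	return s;
}
```

For example, a 16-byte IV with an alignment mask of 15, a context alignment of 8, a 40-byte request and two 24-byte SG entries reserves 24 bytes for the IV and 112 bytes in total.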

[PATCH 3/3] [IPSEC]: Add support for combined mode algorithms