Re: [patch 3/5] ATM: he: fix section mismatch

2006-09-20 Thread Roland Dreier
  WARNING: drivers/atm/he.o - Section mismatch: reference to .init.text: from 
  .text between 'he_start' (at offset 0x218a) and 'he_service_tbrq'

  -static int __init
  +static int __devinit
   he_init_group(struct he_dev *he_dev, int group)

There are a ton of other __init functions (he_init_irq(),
he_init_tx_lbfp(), etc.) called from the __devinit function he_start()
in this driver.  So I think this patch is insufficient -- does it even
really fix the warning??  (I think -funit-at-a-time hides the warning
with this patch on x86-64)

Anyway, I sent a more comprehensive fix to Chas who forwarded it to
DaveM already.

 - R.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] genetlink custom attribute type

2006-09-20 Thread Thomas Graf
* Johannes Berg [EMAIL PROTECTED] 2006-09-14 11:21
 On Thu, 2006-09-14 at 10:14 +0200, Thomas Graf wrote:
 
  Looks good, we have to watch the size of struct nla_policy though.
  This bumps the size from 4 bytes to 16 bytes on 64bit architectures
  which might become a problem since we always use ATTR_MAX sized
  arrays.
 
 Yes, I'm aware of that, but I couldn't think of a way to handle it well.
 
 I thought about using a second array containing just the check
 functions, and then (ab)using `len' to index into it but that didn't
 seem clean enough.

I agree, I talked about this with various people and some ideas
came up.

Always use function pointers to define the validation policy, i.e.
there would be nla_validate_u32() etc. The problem with this approach
is that for every string attribute with different length a separate
validation function is required which simply adds to code what it
saved from text. It also makes exporting the policy to userspace
impossible.

Following up on your idea, we could save a bit by using ERR_PTR() to
store both the callback pointer and type/len tuple in the very same
pointer but that gets very ugly.

I think it's best to use your patch for now and see where this leads.
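
To make the size concern concrete, here is a minimal userspace sketch, not the kernel's actual definition. It assumes the pre-patch policy entry holds two 16-bit fields and that the patch adds one function pointer; the struct names, the check signature and policy_table_bytes() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed pre-patch layout: two 16-bit fields, 4 bytes in total. */
struct policy_plain {
	uint16_t type;
	uint16_t len;
};

/* Assumed post-patch layout: the same fields plus a check callback. */
struct policy_with_check {
	uint16_t type;
	uint16_t len;
	int (*check)(const void *data, int len);	/* hypothetical signature */
};

/* Bytes needed for a policy array indexed 0..attr_max inclusive. */
static size_t policy_table_bytes(size_t entry_size, unsigned int attr_max)
{
	return entry_size * ((size_t)attr_max + 1);
}
```

Because every family declares an array of ATTR_MAX+1 entries, the per-entry growth is multiplied by the number of attributes, which is why the padding introduced by the pointer matters on 64-bit architectures.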


Re: [RFC 3/3] cfg80211 thoughts on configuration

2006-09-20 Thread Thomas Graf
* Johannes Berg [EMAIL PROTECTED] 2006-09-14 12:53
 This is some preliminary code how I'm currently thinking (and that might
 change radically :) ) configuration might look like.
 
 It uses the patch I previously posted to make genetlink attributes
 custom-definable.
 
 --- wireless-dev.orig/include/linux/nl80211.h 2006-09-13 22:06:10.539647141 +0200
 +++ wireless-dev/include/linux/nl80211.h  2006-09-13 22:06:11.919647141 +0200
 @@ -45,6 +45,47 @@ enum {
   /* get list of all interfaces belonging to a wiphy */
   NL80211_CMD_NEW_INTERFACES,
  
 + /* configure device */
 + NL80211_CMD_CONFIGURE,
 +
 + /* request configuration */
 + NL80211_CMD_GET_CONFIG,
 +
 + /* configuration sent from kernel */
 + NL80211_CMD_CONFIGURATION,

I think I brought this up already, it's a lot easier to understand
things if you keep it symmetric, i.e. NL80211_CMD_GET_CONFIG triggers
sending a NL80211_CMD_NEW_CONFIG.

 +static int check_information_element(struct nlattr *nla)
 +{
 +	int len = nla_len(nla);
 +	u8 *data = nla_data(nla);
 +	int elementlen;
 +
 +	while (len >= 2) {
 +		/* 1 byte ID, 1 byte len, `len' bytes data */
 +		elementlen = *(data+1) + 2;
 +		data += elementlen;
 +		len -= elementlen;
 +	}
 +	return len ? -EINVAL : 0;
 +}
 +
  static struct nla_policy nl80211_policy[NL80211_ATTR_MAX+1] __read_mostly = {
   [NL80211_ATTR_IFINDEX] = { .type = NLA_U32 },
   [NL80211_ATTR_WIPHY] = { .type = NLA_U32 },
 @@ -33,6 +49,17 @@ static struct nla_policy nl80211_policy[
.len = NL80211_MAX_FRAME_LEN },
   [NL80211_ATTR_IFNAME] = { .type = NLA_NUL_STRING, .len = IFNAMSIZ-1 },
   [NL80211_ATTR_IFTYPE] = { .type = NLA_U32 },
 + [NL80211_ATTR_NETWORK_ID] = { .type = NLA_U16 },
 + [NL80211_ATTR_CHANNEL] = { .type = NLA_U32 },
 + [NL80211_ATTR_RX_SENSITIVITY] = { .type = NLA_U32 },
 + [NL80211_ATTR_BSSID] = { .len = 6 },
 + [NL80211_ATTR_SSID] = { .type = NLA_STRING, .len = 32 },
 + [NL80211_ATTR_TRANSMIT_POWER] = { .type = NLA_U32 },
 + [NL80211_ATTR_FRAG_THRESHOLD] = { .type = NLA_U32 },
 + [NL80211_ATTR_INFORMATION_ELEMENT] = { .type = NLA_CUSTOM_CHECK,
 +   .check = check_information_element },

Just use a nested attribute here, this new array format you introduce
having 1 byte ID, 1 byte len is equivalent to using a set of nested
attributes with nla_type=id, nla_len=len.
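
For readers following the thread, the 1-byte-ID / 1-byte-length walk that the quoted check function performs can be sketched in userspace C. This is only an illustration under stated assumptions: ie_buffer_check() is a hypothetical name, it takes a raw buffer instead of a struct nlattr, and it adds an explicit bounds check because it uses an unsigned length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Walk a buffer of elements laid out as:
 *   1 byte ID, 1 byte length, then `length` bytes of data.
 * Returns 0 if the elements exactly fill the buffer, -1 otherwise.
 */
static int ie_buffer_check(const uint8_t *data, size_t len)
{
	while (len >= 2) {
		size_t elementlen = (size_t)data[1] + 2;

		if (elementlen > len)
			return -1;	/* element would run past the end */
		data += elementlen;
		len -= elementlen;
	}
	/* A single leftover byte cannot be a complete element header. */
	return len ? -1 : 0;
}
```

The walk only validates the framing; the bytes themselves are left untouched.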


Re: [PATCH] make some netfilter globals __read_mostly

2006-09-20 Thread Patrick McHardy
Brian Haley wrote:
 Make some netfilter globals __read_mostly at the request of Patrick
 McHardy.
 
 Signed-off-by: Brian Haley [EMAIL PROTECTED]

Applied, thanks Brian.



Re: [patch 4/5] bluetooth: small cleanups

2006-09-20 Thread Marcel Holtmann
Hi Andrew,

 From: Pavel Machek [EMAIL PROTECTED]
 
 This cleans up bluetooth a bit, no code changes.

NAK on this one. These cleanups will come with my other pending
cleanups.

Regards

Marcel




Re: [RFC 3/3] cfg80211 thoughts on configuration

2006-09-20 Thread Johannes Berg
On Wed, 2006-09-20 at 08:33 +0200, Thomas Graf wrote:

 I think I brought this up already, it's a lot easier to understand
 things if you keep it symmetric, i.e. NL80211_CMD_GET_CONFIG triggers
 sending a NL80211_CMD_NEW_CONFIG.

Yes, I think you did :) I'll do that as soon as I get around to
reworking it (hoping for more comments...)

 Just use a nested attribute here, this new array format you introduce
 having 1 byte ID, 1 byte len is equivalent to using a set of nested
 attributes with nla_type=id, nla_len=len.

No, it is only validated, it is then supposed to be copied verbatim into
some 802.11 frames. I thought validating it would be a good idea to not
send out totally bogus frames, but I didn't want to have to mangle it in
the kernel.

johannes


Re: [RFC 3/3] cfg80211 thoughts on configuration

2006-09-20 Thread Thomas Graf
* Johannes Berg [EMAIL PROTECTED] 2006-09-20 09:03
  Just use a nested attribute here, this new array format you introduce
  having 1 byte ID, 1 byte len is equivalent to using a set of nested
  attributes with nla_type=id, nla_len=len.
 
 No, it is only validated, it is then supposed to be copied verbatim into
 some 802.11 frames. I thought validating it would be a good idea to not
 send out totally bogus frames, but I didn't want to have to mangle it in
 the kernel.

I see, fair enough, wasn't able to get that from your code.


UDP Out of Sequence

2006-09-20 Thread Majumder, Rajib
Hi,

If I write UDP datagrams 1, 2 and 3 to the network and the receiver receives
them in the order 2, 1, 3, where can the sequence get changed? Is it in the
source stack, in network transit, or in the destination stack?

Any reply is highly appreciated. 

Thanks

Rajib 



Suspend/Resume: IPv6 default route gets lost

2006-09-20 Thread Pekka Savola

Hello,

I'm using FC5 w/ 2.6.17-1.2174_FC5.  When the laptop resumes from
suspend, it does not re-send an IPv6 router solicitation (RS).  So, if the
IPv6 default route expired while you were in suspend, you'll have to
wait for the next unsolicited multicast RA.


A workaround is to cycle the interface in the ACPI resume scripts.

Maybe triggering an RS is a missing feature in the kernel's suspend/resume
functionality?


There has also been discussion (years ago) about a user-space interface
for triggering an RS, but AFAIK none exists right now.


--
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings


Re: [PATCH] tcp: simpler bic default

2006-09-20 Thread bert hubert
On Tue, Sep 19, 2006 at 04:23:55PM -0700, Stephen Hemminger wrote:
 Okay, build testing all the possibilities now, answer by morning..

Please boot some of them as well - I can see a kernel that really wants to
load bic at boot time but can't find it.

Bert

-- 
http://www.PowerDNS.com  Open source, database driven DNS Software 
http://netherlabs.nl  Open and Closed source services


Re: [patch 3/4] Make sure ip_vs_ftp ports are valid

2006-09-20 Thread Patrick McHardy
Horms wrote:
 Here is the revised patch.
 

 [IPVS] Make sure ip_vs_ftp ports are valid

 I'm not entirely sure what happens in the case of an invalid port,
 at best it'll be silently ignored. This patch ensures that
 the port values are unsigned short values, and thus always valid.

 Cc: Patrick McHardy [EMAIL PROTECTED]
 Signed-Off-By: Simon Horman [EMAIL PROTECTED]

 Index: linux-2.6/net/ipv4/ipvs/ip_vs_ftp.c
 ===
 --- linux-2.6.orig/net/ipv4/ipvs/ip_vs_ftp.c  2006-09-04 10:47:09.0 +0900
 +++ linux-2.6/net/ipv4/ipvs/ip_vs_ftp.c   2006-09-04 10:59:30.0 +0900
 @@ -44,8 +44,8 @@
   * List of ports (up to IP_VS_APP_MAX_PORTS) to be handled by helper
   * First port is set to the default port.
   */
 -static int ports[IP_VS_APP_MAX_PORTS] = {21, 0};
 -module_param_array(ports, int, NULL, 0);
 +static unsigned short ports[IP_VS_APP_MAX_PORTS] = {21, 0};
 +module_param_array(ports, ushort, NULL, 0);
  MODULE_PARM_DESC(ports, "Ports to monitor for FTP control commands");

  /*

It looks like the wrong patch went in:

http://marc.theaimsgroup.com/?l=git-commits-head&m=115862407021941&w=2
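
The intent of the revised patch, accepting only values that fit a port number, can be illustrated with a small userspace sketch. This is not the kernel's module parameter code; parse_port() is a hypothetical helper that shows the range check one would expect from declaring the parameter as an unsigned short.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a decimal port number.
 * Returns 0 and stores the value on success, -1 if the string is
 * malformed or the value does not fit in 0..65535.
 */
static int parse_port(const char *s, unsigned short *port)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(s, &end, 10);
	if (errno || end == s || *end != '\0')
		return -1;	/* overflow, empty input, or trailing junk */
	if (v < 0 || v > 65535)
		return -1;	/* outside the 16-bit port range */
	*port = (unsigned short)v;
	return 0;
}
```

With a plain int parameter, a value such as 65536 or -1 would be stored as given; with the check above it is rejected before it reaches the port list.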



source ip selection for local route

2006-09-20 Thread Kimitoshi Takahashi
Hello,

I have a question regarding the subject.
In the old threads, it seems to have been concluded that ignoring the preferred
source in a local route was a bug, and a patch was proposed.

See,
http://marc.theaimsgroup.com/?l=linux-netdev&m=99985580920599&w=2

There, the following patch is proposed:
--- net/ipv4/route.c.x  Fri Sep  7 02:30:54 2001
+++ net/ipv4/route.c    Fri Sep  7 02:30:54 2001
@@ -1795,14 +1795,13 @@

 	if (res.type == RTN_LOCAL) {
 		if (!key.src)
-			key.src = key.dst;
+			key.src = res.fi->fib_prefsrc ? : key.dst;
 		if (dev_out)
 			dev_put(dev_out);
 		dev_out = loopback_dev;
 		dev_hold(dev_out);
 		key.oif = dev_out->ifindex;
-		if (res.fi)
-			fib_info_put(res.fi);
+		fib_info_put(res.fi);
 		res.fi = NULL;
 		flags |= RTCF_LOCAL;
 		goto make_route;

However, in a relatively recent kernel (2.6.17.9) it seems that the patch
hasn't been applied.

net/ipv4/route.c
	if (res.type == RTN_LOCAL) {
		if (!fl.fl4_src)
			fl.fl4_src = fl.fl4_dst;
		if (dev_out)
			dev_put(dev_out);
		dev_out = loopback_dev;
		dev_hold(dev_out);
		fl.oif = dev_out->ifindex;
		if (res.fi)
			fib_info_put(res.fi);
		res.fi = NULL;
		flags |= RTCF_LOCAL;
		goto make_route;
	}

And indeed, the source IP of the communication between two local interfaces is
always that of the destination.

So the question is why the patch hasn't been applied to the mainline kernel,
although deciding the source IP for a local route based on the routing table
seems more natural than making it the IP of the destination.
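
The selection rule proposed by the 2001 patch can be sketched in plain C. This is an illustration only: local_route_src() is a hypothetical helper, addresses are plain 32-bit integers with 0 meaning "not set", and no kernel structures are involved.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pick the source address for a local route.
 * requested_src: address the caller bound explicitly, or 0
 * prefsrc:       preferred source from the routing table, or 0
 * dst:           destination address
 */
static uint32_t local_route_src(uint32_t requested_src, uint32_t prefsrc,
				uint32_t dst)
{
	if (requested_src)
		return requested_src;	/* an explicit source always wins */
	/* Proposed behaviour; the 2.6.17.9 code would return dst here. */
	return prefsrc ? prefsrc : dst;
}
```

With this rule the destination address is used as the source only when the route carries no preferred source.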


The actual problem I have is the following (if you are interested).
We have an NFS server whose IP address is shared between two hosts using VRRP,
and those two hosts also act as NFS clients.

When host1 is the NFS server, the two hosts have the following IPs:
host1  IP1, VRIP (NFS server's IP shared using VRRP)
host2  IP2

The NFS server and client IPs are then as follows, because on host1 the source
IP of an NFS packet becomes that of the destination, i.e. VRIP:
           NFS server IP   NFS client IP
on host1   VRIP            VRIP
on host2   VRIP            IP2

We also share the content of the rmtab, which is something like
VRIP:/filesystem:0x0001
IP2:/filesystem:0x0001
(the first column is the NFS client, the second is the shared file system and
the third is the mount count.)

When something happens to host1, failover is triggered and the VRIP is
moved to host2:
host1  IP1
host2  IP2, VRIP (NFS server's IP shared using VRRP)

           NFS server IP   NFS client IP
on host1   VRIP            IP1
on host2   VRIP            VRIP

Accesses to the NFS-mounted filesystem on host1 will be denied if the content
of the rmtab doesn't change, because the NFS server on host2 thinks the only
clients are VRIP and IP2.

Of course, this can be avoided if we unmount and remount the file system on the
client, or if we appropriately change the content of the rmtab when the
failover occurs.

However, I think it would be much nicer if the source IP of the client were
always the primary IP of an interface. This is achieved if the source IP is
determined by the preferred source in the routing table.

Then the IPs of the NFS server and the clients are always as follows, and this
doesn't cause any problem when the failover happens:
           NFS server IP   NFS client IP
on host1   VRIP            IP1
on host2   VRIP            IP2

Thanks in advance,
Kimitoshi Takahashi, Cluster Computing Inc., Japan



[take19 2/4] kevent: poll/select() notifications.

2006-09-20 Thread Evgeniy Polyakov

poll/select() notifications.

This patch includes generic poll/select notifications.

kevent_poll works similarly to epoll and has the same issues (the callback
is invoked not from the caller's internal state machine but through a
process wakeup, there are a lot of allocations, and so on).

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2561020..a697930 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -236,6 +236,7 @@ #include <linux/prio_tree.h>
 #include <linux/init.h>
 #include <linux/sched.h>
 #include <linux/mutex.h>
+#include <linux/kevent.h>
 
 #include <asm/atomic.h>
 #include <asm/semaphore.h>
@@ -546,6 +547,10 @@ #ifdef CONFIG_INOTIFY
 	struct mutex		inotify_mutex;	/* protects the watches list */
 #endif
 
+#ifdef CONFIG_KEVENT_SOCKET
+	struct kevent_storage	st;
+#endif
+
 	unsigned long		i_state;
 	unsigned long		dirtied_when;	/* jiffies of first dirtying */
 
@@ -698,6 +703,9 @@ #ifdef CONFIG_EPOLL
 	struct list_head	f_ep_links;
 	spinlock_t		f_ep_lock;
 #endif /* #ifdef CONFIG_EPOLL */
+#ifdef CONFIG_KEVENT_POLL
+	struct kevent_storage	st;
+#endif
 	struct address_space	*f_mapping;
 };
 extern spinlock_t files_lock;
diff --git a/kernel/kevent/kevent_poll.c b/kernel/kevent/kevent_poll.c
new file mode 100644
index 000..fb74e0f
--- /dev/null
+++ b/kernel/kevent/kevent_poll.c
@@ -0,0 +1,222 @@
+/*
+ * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/timer.h>
+#include <linux/file.h>
+#include <linux/kevent.h>
+#include <linux/poll.h>
+#include <linux/fs.h>
+
+static kmem_cache_t *kevent_poll_container_cache;
+static kmem_cache_t *kevent_poll_priv_cache;
+
+struct kevent_poll_ctl
+{
+	struct poll_table_struct	pt;
+	struct kevent			*k;
+};
+
+struct kevent_poll_wait_container
+{
+	struct list_head		container_entry;
+	wait_queue_head_t		*whead;
+	wait_queue_t			wait;
+	struct kevent			*k;
+};
+
+struct kevent_poll_private
+{
+	struct list_head		container_list;
+	spinlock_t			container_lock;
+};
+
+static int kevent_poll_enqueue(struct kevent *k);
+static int kevent_poll_dequeue(struct kevent *k);
+static int kevent_poll_callback(struct kevent *k);
+
+static int kevent_poll_wait_callback(wait_queue_t *wait,
+		unsigned mode, int sync, void *key)
+{
+	struct kevent_poll_wait_container *cont =
+		container_of(wait, struct kevent_poll_wait_container, wait);
+	struct kevent *k = cont->k;
+	struct file *file = k->st->origin;
+	u32 revents;
+
+	revents = file->f_op->poll(file, NULL);
+
+	kevent_storage_ready(k->st, NULL, revents);
+
+	return 0;
+}
+
+static void kevent_poll_qproc(struct file *file, wait_queue_head_t *whead,
+		struct poll_table_struct *poll_table)
+{
+	struct kevent *k =
+		container_of(poll_table, struct kevent_poll_ctl, pt)->k;
+	struct kevent_poll_private *priv = k->priv;
+	struct kevent_poll_wait_container *cont;
+	unsigned long flags;
+
+	cont = kmem_cache_alloc(kevent_poll_container_cache, SLAB_KERNEL);
+	if (!cont) {
+		kevent_break(k);
+		return;
+	}
+
+	cont->k = k;
+	init_waitqueue_func_entry(&cont->wait, kevent_poll_wait_callback);
+	cont->whead = whead;
+
+	spin_lock_irqsave(&priv->container_lock, flags);
+	list_add_tail(&cont->container_entry, &priv->container_list);
+	spin_unlock_irqrestore(&priv->container_lock, flags);
+
+	add_wait_queue(whead, &cont->wait);
+}
+
+static int kevent_poll_enqueue(struct kevent *k)
+{
+	struct file *file;
+	int err, ready = 0;
+	unsigned int revents;
+	struct kevent_poll_ctl ctl;
+	struct kevent_poll_private *priv;
+
+	file = fget(k->event.id.raw[0]);
+	if (!file)
+		return -ENODEV;
+
+	err = -EINVAL;
+	if (!file->f_op || !file->f_op->poll)
+		goto err_out_fput;
+
+	err = -ENOMEM;
+	priv = kmem_cache_alloc(kevent_poll_priv_cache, SLAB_KERNEL);
+	if (!priv)
+		goto err_out_fput;
+
+  

[take19 0/4] kevent: Generic event handling mechanism.

2006-09-20 Thread Evgeniy Polyakov

Generic event handling mechanism.

Consider for inclusion.

Changes from 'take18' patchset:
 * use __init instead of __devinit
 * removed 'default N' from config for user statistic
 * removed kevent_user_fini() since kevent can not be unloaded
 * use KERN_INFO for statistic output

Changes from 'take17' patchset:
 * Use RB tree instead of hash table.
   At least for a web server, the frequency of addition/deletion of new kevents
   is comparable with the number of search accesses, i.e. most of the time
   events are added, accessed only a couple of times and then removed, so it
   justifies RB tree usage over AVL tree, since the latter does have much
   slower deletion time (max O(log(N)) compared to 3 ops),
   although faster search time (1.44*O(log(N)) vs. 2*O(log(N))).
   So for kevents I use RB tree for now and later, when my AVL tree
   implementation is ready, it will be possible to compare them.
 * Changed readiness check for socket notifications.

With both of the above changes it is possible to achieve more than 3380
req/second, compared to 2200, sometimes 2500 req/second for epoll(), for a
trivial web server and httperf client on the same hardware.
It is possible that the above kevent limit is due to the maximum allowed
kevents in a time limit, which is 4096 events.

Changes from 'take16' patchset:
 * misc cleanups (__read_mostly, const ...)
 * created special macro which is used for mmap size (number of pages)
   calculation
 * export kevent_socket_notify(), since it is used in network protocols which
   can be built as modules (IPv6 for example)

Changes from 'take15' patchset:
 * converted kevent_timer to high-resolution timers, this forces timer API
   update at http://linux-net.osdl.org/index.php/Kevent
 * use struct ukevent* instead of void * in syscalls (documentation has been
   updated)
 * added warning in kevent_add_ukevent() if ring has broken index (for testing)

Changes from 'take14' patchset:
 * added kevent_wait()
   This syscall waits until either the timeout expires or at least one event
   becomes ready. It also commits that @num events from @start are processed
   by userspace and thus can be removed or rearmed (depending on their flags).
   It can be used to commit events read by userspace through the mmap interface.
   Example userspace code (evtest.c) can be found on the project's homepage.
 * added socket notifications (send/recv/accept)

Changes from 'take13' patchset:
 * do not get lock around user data check in __kevent_search()
 * fail early if there were no registered callbacks for given type of kevent
 * trailing whitespace cleanup

Changes from 'take12' patchset:
 * remove non-chardev interface for initialization
 * use pointer to kevent_mring instead of unsigned longs
 * use aligned 64bit type in raw user data (can be used by high-res timer if
   needed)
 * simplified enqueue/dequeue callbacks and kevent initialization
 * use nanoseconds for timeout
 * put number of milliseconds into timer's return data
 * move some definitions into user-visible header
 * removed filenames from comments

Changes from 'take11' patchset:
 * include missing headers into patchset
 * some trivial code cleanups (use goto instead of if/else games and so on)
 * some whitespace cleanups
 * check for ready_callback() callback before main loop which should save us
   some ticks

Changes from 'take10' patchset:
 * removed non-existent prototypes
 * added helper function for kevent_registered_callbacks
 * fixed 80 lines comments issues
 * added a header shared between userspace and kernelspace instead of
   embedding them in one
 * core restructuring to remove forward declarations
 * s o m e w h i t e s p a c e c o d y n g s t y l e c l e a n u p
 * use vm_insert_page() instead of remap_pfn_range()

Changes from 'take9' patchset:
 * fixed ->nopage method

Changes from 'take8' patchset:
 * fixed mmap release bug
 * use module_init() instead of late_initcall()
 * use better structures for timer notifications

Changes from 'take7' patchset:
 * new mmap interface (not tested, waiting for other changes to be acked)
   - use nopage() method to dynamically substitute pages
   - allocate new page for events only when a newly added kevent requires it
   - do not use ugly index dereferencing, use structure instead
   - reduced amount of data in the ring (id and flags),
     maximum 12 pages on x86 per kevent fd

Changes from 'take6' patchset:
 * a lot of comments!
 * do not use list poisoning for detection of the fact, that entry is in the
   list
 * return number of ready kevents even if copy*user() fails
 * strict check for number of kevents in syscall
 * use ARRAY_SIZE for array size calculation
 * changed superblock magic number
 * use SLAB_PANIC instead of direct panic() call
 * changed -E* return values
 * a lot of small cleanups and indent fixes

Changes from 'take5' patchset:
 * removed compilation warnings about unused variables when lockdep is not 

[take19 3/4] kevent: Socket notifications.

2006-09-20 Thread Evgeniy Polyakov

Socket notifications.

This patch includes socket send/recv/accept notifications.
Using a trivial web server based on kevent and these features
instead of epoll, its performance increased more than noticeably.
More details about the benchmark and the server itself (evserver_kevent.c)
can be found on the project's homepage.

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/fs/inode.c b/fs/inode.c
index 0bf9f04..181521d 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -21,6 +21,7 @@ #include <linux/pagemap.h>
 #include <linux/cdev.h>
 #include <linux/bootmem.h>
 #include <linux/inotify.h>
+#include <linux/kevent.h>
 #include <linux/mount.h>
 
 /*
@@ -165,12 +166,18 @@ #endif
 		}
 		memset(&inode->u, 0, sizeof(inode->u));
 		inode->i_mapping = mapping;
+#if defined CONFIG_KEVENT_SOCKET
+		kevent_storage_init(inode, &inode->st);
+#endif
 	}
 	return inode;
 }
 
 void destroy_inode(struct inode *inode) 
 {
+#if defined CONFIG_KEVENT_SOCKET
+	kevent_storage_fini(&inode->st);
+#endif
 	BUG_ON(inode_has_buffers(inode));
 	security_inode_free(inode);
 	if (inode->i_sb->s_op->destroy_inode)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2561020..a697930 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -236,6 +236,7 @@ #include <linux/prio_tree.h>
 #include <linux/init.h>
 #include <linux/sched.h>
 #include <linux/mutex.h>
+#include <linux/kevent.h>
 
 #include <asm/atomic.h>
 #include <asm/semaphore.h>
@@ -546,6 +547,10 @@ #ifdef CONFIG_INOTIFY
 	struct mutex		inotify_mutex;	/* protects the watches list */
 #endif
 
+#ifdef CONFIG_KEVENT_SOCKET
+	struct kevent_storage	st;
+#endif
+
 	unsigned long		i_state;
 	unsigned long		dirtied_when;	/* jiffies of first dirtying */
 
@@ -698,6 +703,9 @@ #ifdef CONFIG_EPOLL
 	struct list_head	f_ep_links;
 	spinlock_t		f_ep_lock;
 #endif /* #ifdef CONFIG_EPOLL */
+#ifdef CONFIG_KEVENT_POLL
+	struct kevent_storage	st;
+#endif
 	struct address_space	*f_mapping;
 };
 extern spinlock_t files_lock;
diff --git a/include/net/sock.h b/include/net/sock.h
index 324b3ea..5d71ed7 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -48,6 +48,7 @@ #include <linux/lockdep.h>
 #include <linux/netdevice.h>
 #include <linux/skbuff.h>	/* struct sk_buff */
 #include <linux/security.h>
+#include <linux/kevent.h>
 
 #include <linux/filter.h>
 
@@ -450,6 +451,21 @@ static inline int sk_stream_memory_free(
 
 extern void sk_stream_rfree(struct sk_buff *skb);
 
+struct socket_alloc {
+	struct socket socket;
+	struct inode vfs_inode;
+};
+
+static inline struct socket *SOCKET_I(struct inode *inode)
+{
+	return &container_of(inode, struct socket_alloc, vfs_inode)->socket;
+}
+
+static inline struct inode *SOCK_INODE(struct socket *socket)
+{
+	return &container_of(socket, struct socket_alloc, socket)->vfs_inode;
+}
+
 static inline void sk_stream_set_owner_r(struct sk_buff *skb, struct sock *sk)
 {
 	skb->sk = sk;
@@ -477,6 +493,7 @@ static inline void sk_add_backlog(struct
 		sk->sk_backlog.tail = skb;
 	}
 	skb->next = NULL;
+	kevent_socket_notify(sk, KEVENT_SOCKET_RECV);
 }
 
 #define sk_wait_event(__sk, __timeo, __condition)		\
@@ -679,21 +696,6 @@ static inline struct kiocb *siocb_to_kio
 	return si->kiocb;
 }
 
-struct socket_alloc {
-	struct socket socket;
-	struct inode vfs_inode;
-};
-
-static inline struct socket *SOCKET_I(struct inode *inode)
-{
-	return &container_of(inode, struct socket_alloc, vfs_inode)->socket;
-}
-
-static inline struct inode *SOCK_INODE(struct socket *socket)
-{
-	return &container_of(socket, struct socket_alloc, socket)->vfs_inode;
-}
-
 extern void __sk_stream_mem_reclaim(struct sock *sk);
 extern int sk_stream_mem_schedule(struct sock *sk, int size, int kind);
 
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 7a093d0..69f4ad2 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -857,6 +857,7 @@ static inline int tcp_prequeue(struct so
 			tp->ucopy.memory = 0;
 		} else if (skb_queue_len(&tp->ucopy.prequeue) == 1) {
 			wake_up_interruptible(sk->sk_sleep);
+			kevent_socket_notify(sk, KEVENT_SOCKET_RECV|KEVENT_SOCKET_SEND);
 			if (!inet_csk_ack_scheduled(sk))
 				inet_csk_reset_xmit_timer(sk, ICSK_TIME_DACK,
 							  (3 * TCP_RTO_MIN) / 4,
diff --git a/kernel/kevent/kevent_socket.c b/kernel/kevent/kevent_socket.c
new file mode 100644
index 000..1ddd2a1
--- /dev/null
+++ b/kernel/kevent/kevent_socket.c
@@ -0,0 +1,126 @@
+/*
+ * kevent_socket.c
+ * 
+ * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ * 
+ * This program is free software; you can redistribute it and/or modify
+ * it under the 

[take19 4/4] kevent: Timer notifications.

2006-09-20 Thread Evgeniy Polyakov

Timer notifications.

Timer notifications can be used for fine grained per-process time 
management, since interval timers are very inconvenient to use, 
and they are limited.

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/kernel/kevent/kevent_timer.c b/kernel/kevent/kevent_timer.c
new file mode 100644
index 000..04acc46
--- /dev/null
+++ b/kernel/kevent/kevent_timer.c
@@ -0,0 +1,113 @@
+/*
+ * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include linux/kernel.h
+#include linux/types.h
+#include linux/list.h
+#include linux/slab.h
+#include linux/spinlock.h
+#include linux/hrtimer.h
+#include linux/jiffies.h
+#include linux/kevent.h
+
+struct kevent_timer
+{
+   struct hrtimer  ktimer;
+   struct kevent_storage   ktimer_storage;
+   struct kevent   *ktimer_event;
+};
+
+static int kevent_timer_func(struct hrtimer *timer)
+{
+   struct kevent_timer *t = container_of(timer, struct kevent_timer, 
ktimer);
+   struct kevent *k = t-ktimer_event;
+
+   kevent_storage_ready(t-ktimer_storage, NULL, KEVENT_MASK_ALL);
+   hrtimer_forward(timer, timer-base-softirq_time,
+   ktime_set(k-event.id.raw[0], k-event.id.raw[1]));
+   return HRTIMER_RESTART;
+}
+
+static struct lock_class_key kevent_timer_key;
+
+static int kevent_timer_enqueue(struct kevent *k)
+{
+   int err;
+   struct kevent_timer *t;
+
+   t = kmalloc(sizeof(struct kevent_timer), GFP_KERNEL);
+   if (!t)
+   return -ENOMEM;
+
+   hrtimer_init(t-ktimer, CLOCK_MONOTONIC, HRTIMER_REL);
+   t-ktimer.expires = ktime_set(k-event.id.raw[0], k-event.id.raw[1]);
+   t-ktimer.function = kevent_timer_func;
+   t-ktimer_event = k;
+
+   err = kevent_storage_init(t-ktimer, t-ktimer_storage);
+   if (err)
+   goto err_out_free;
+   lockdep_set_class(t-ktimer_storage.lock, kevent_timer_key);
+
+   err = kevent_storage_enqueue(t-ktimer_storage, k);
+   if (err)
+   goto err_out_st_fini;
+
+   printk(%s: jiffies: %lu, timer: %p.\n, __func__, jiffies, t-ktimer);
+   hrtimer_start(t-ktimer, t-ktimer.expires, HRTIMER_REL);
+
+   return 0;
+
+err_out_st_fini:
+   kevent_storage_fini(t-ktimer_storage);
+err_out_free:
+   kfree(t);
+
+   return err;
+}
+
+static int kevent_timer_dequeue(struct kevent *k)
+{
+   struct kevent_storage *st = k->st;
+   struct kevent_timer *t = container_of(st, struct kevent_timer, ktimer_storage);
+
+   hrtimer_cancel(&t->ktimer);
+   kevent_storage_dequeue(st, k);
+   kfree(t);
+
+   return 0;
+}
+
+static int kevent_timer_callback(struct kevent *k)
+{
+   k->event.ret_data[0] = jiffies_to_msecs(jiffies);
+   return 1;
+}
+
+static int __init kevent_init_timer(void)
+{
+   struct kevent_callbacks tc = {
+       .callback = kevent_timer_callback,
+       .enqueue = kevent_timer_enqueue,
+       .dequeue = kevent_timer_dequeue};
+
+   return kevent_add_callbacks(&tc, KEVENT_TIMER);
+}
+module_init(kevent_init_timer);
+

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[take19 1/4] kevent: Core files.

2006-09-20 Thread Evgeniy Polyakov

Core files.

This patch includes core kevent files:
 - userspace controlling
 - kernelspace interfaces
 - initialization
 - notification state machines

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/arch/i386/kernel/syscall_table.S b/arch/i386/kernel/syscall_table.S
index dd63d47..c10698e 100644
--- a/arch/i386/kernel/syscall_table.S
+++ b/arch/i386/kernel/syscall_table.S
@@ -317,3 +317,6 @@ ENTRY(sys_call_table)
.long sys_tee   /* 315 */
.long sys_vmsplice
.long sys_move_pages
+   .long sys_kevent_get_events
+   .long sys_kevent_ctl
+   .long sys_kevent_wait   /* 320 */
diff --git a/arch/x86_64/ia32/ia32entry.S b/arch/x86_64/ia32/ia32entry.S
index 5d4a7d1..a06b76f 100644
--- a/arch/x86_64/ia32/ia32entry.S
+++ b/arch/x86_64/ia32/ia32entry.S
@@ -710,7 +710,10 @@ #endif
.quad compat_sys_get_robust_list
.quad sys_splice
.quad sys_sync_file_range
-   .quad sys_tee
+   .quad sys_tee   /* 315 */
.quad compat_sys_vmsplice
.quad compat_sys_move_pages
+   .quad sys_kevent_get_events
+   .quad sys_kevent_ctl
+   .quad sys_kevent_wait   /* 320 */
 ia32_syscall_end:  
diff --git a/include/asm-i386/unistd.h b/include/asm-i386/unistd.h
index fc1c8dd..68072b5 100644
--- a/include/asm-i386/unistd.h
+++ b/include/asm-i386/unistd.h
@@ -323,10 +323,13 @@ #define __NR_sync_file_range  314
 #define __NR_tee               315
 #define __NR_vmsplice          316
 #define __NR_move_pages        317
+#define __NR_kevent_get_events 318
+#define __NR_kevent_ctl        319
+#define __NR_kevent_wait       320
 
 #ifdef __KERNEL__
 
-#define NR_syscalls 318
+#define NR_syscalls 321
 
 /*
  * user-visible error numbers are in the range -1 - -128: see
diff --git a/include/asm-x86_64/unistd.h b/include/asm-x86_64/unistd.h
index 94387c9..ee907ad 100644
--- a/include/asm-x86_64/unistd.h
+++ b/include/asm-x86_64/unistd.h
@@ -619,10 +619,16 @@ #define __NR_vmsplice 278
 __SYSCALL(__NR_vmsplice, sys_vmsplice)
 #define __NR_move_pages        279
 __SYSCALL(__NR_move_pages, sys_move_pages)
+#define __NR_kevent_get_events 280
+__SYSCALL(__NR_kevent_get_events, sys_kevent_get_events)
+#define __NR_kevent_ctl        281
+__SYSCALL(__NR_kevent_ctl, sys_kevent_ctl)
+#define __NR_kevent_wait       282
+__SYSCALL(__NR_kevent_wait, sys_kevent_wait)
 
 #ifdef __KERNEL__
 
-#define __NR_syscall_max __NR_move_pages
+#define __NR_syscall_max __NR_kevent_wait
 
 #ifndef __NO_STUBS
 
diff --git a/include/linux/kevent.h b/include/linux/kevent.h
new file mode 100644
index 000..24ced10
--- /dev/null
+++ b/include/linux/kevent.h
@@ -0,0 +1,195 @@
+/*
+ * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#ifndef __KEVENT_H
+#define __KEVENT_H
+#include <linux/types.h>
+#include <linux/list.h>
+#include <linux/rbtree.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/wait.h>
+#include <linux/net.h>
+#include <linux/rcupdate.h>
+#include <linux/kevent_storage.h>
+#include <linux/ukevent.h>
+
+#define KEVENT_MIN_BUFFS_ALLOC 3
+
+struct kevent;
+struct kevent_storage;
+typedef int (* kevent_callback_t)(struct kevent *);
+
+/* @callback is called each time new event has been caught. */
+/* @enqueue is called each time new event is queued. */
+/* @dequeue is called each time event is dequeued. */
+
+struct kevent_callbacks {
+   kevent_callback_t   callback, enqueue, dequeue;
+};
+
+#define KEVENT_READY   0x1
+#define KEVENT_STORAGE 0x2
+#define KEVENT_USER    0x4
+
+struct kevent
+{
+   /* Used for kevent freeing.*/
+   struct rcu_head rcu_head;
+   struct ukevent  event;
+   /* This lock protects ukevent manipulations, e.g. ret_flags changes. */
+   spinlock_t  ulock;
+
+   /* Entry of user's tree. */
+   struct rb_node  kevent_node;
+   /* Entry of origin's queue. */
+   struct list_head    storage_entry;
+   /* Entry of user's ready. */
+   struct list_head    ready_entry;
+
+   u32 flags;
+
+   /* User who requested this kevent. */
+   

RE: [patch 3/3] Add tsi108 On Chip Ethernet device driver support

2006-09-20 Thread Arjan van de Ven
On Tue, 2006-09-19 at 15:39 +0800, Zang Roy-r61911 wrote:

  
   + spin_unlock_irq(phy_lock);
   + msleep(10);
   + spin_lock_irq(phy_lock);
   + }
  
  hmm some places take phy_lock with disabling interrupts, while others
  don't. I sort of fear the others may be buggy are you sure those
  are ok?
 Could you interpret your comments in detail?
 Roy

Hi,

sorry for being unclear/too short in the review.

The phy_lock lock is sometimes taken as spin_lock() and sometimes as
spin_lock_irq(). It looks like it can be used in interrupt context, in
which case the spin_lock_irq() version is correct and the places where
spin_lock() is used would be a deadlock bug (just think what happens if
the interrupt happens while spin_lock(phy_lock) is held, and the
interrupt handler then tries to take the lock again!)

If there is no way this lock is used in interrupt context, then the
spin_lock_irq() version is doing something which is not needed and also
a bit expensive; so it could be optimized.

But my impression is that the _irq() is needed. Also, please consider
switching from spin_lock_irq() to the spin_lock_irqsave() version instead;
spin_unlock_irq() has some side effects (interrupts get enabled
unconditionally) so it is generally safer to use the
spin_lock_irqsave()/spin_unlock_irqrestore() API.
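
To make the side effect concrete, here is a small userspace model. It is
not kernel code: the irqs_enabled flag and all model_*/helper_* names are
invented for illustration, and a plain bool stands in for the CPU's
interrupt-enable state. It only sketches why the unconditional enable can
surprise a caller that already had interrupts off.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: one flag models the CPU interrupt-enable state. */
static bool irqs_enabled = true;

/* Models spin_lock_irq(): disable interrupts unconditionally. */
static void model_lock_irq(void) { irqs_enabled = false; }

/* Models spin_unlock_irq(): enable interrupts unconditionally. */
static void model_unlock_irq(void) { irqs_enabled = true; }

/* Models spin_lock_irqsave(): remember the old state, then disable. */
static bool model_lock_irqsave(void)
{
    bool flags = irqs_enabled;

    irqs_enabled = false;
    return flags;
}

/* Models spin_unlock_irqrestore(): put back the saved state. */
static void model_unlock_irqrestore(bool flags) { irqs_enabled = flags; }

/* Helper using the _irq pair; returns the interrupt state it leaves behind. */
static bool helper_irq(void)
{
    model_lock_irq();
    assert(!irqs_enabled);      /* critical section runs with irqs off */
    model_unlock_irq();
    return irqs_enabled;        /* always true, whatever the caller had */
}

/* Helper using the irqsave pair; returns the interrupt state it leaves behind. */
static bool helper_irqsave(void)
{
    bool flags = model_lock_irqsave();

    assert(!irqs_enabled);      /* critical section runs with irqs off */
    model_unlock_irqrestore(flags);
    return irqs_enabled;        /* same as before the call */
}
```

If a caller enters helper_irq() with interrupts already disabled, it gets
them back enabled; helper_irqsave() leaves the caller's state alone.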

If you have more questions please do not hesitate to ask!

Greetings,
   Arjan van de Ven 
-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com



Re: 2.6.18-rc7-mm1

2006-09-20 Thread Mike Galbraith
On Tue, 2006-09-19 at 13:36 -0700, Andrew Morton wrote:
 On Tue, 19 Sep 2006 22:25:21 +0200
 Rafael J. Wysocki [EMAIL PROTECTED] wrote:
 
   - It took maybe ten hours solid work to get this dogpile vaguely
 compiling and limping to a login prompt on x86, x86_64 and powerpc. 
 I guess it's worth briefly testing if you're keen.
  
  It's not that bad, but unfortunately the networking doesn't work on my 
  system
  (HPC nx6325 + SUSE 10.1 w/ updates, 64-bit).  Apparently, the interfaces 
  don't
  get configured (both tg3 and bcm43xx are affected).
 
 Is there anything interesting in the dmesg output?
 
 Perhaps an `strace -f ifup' or whatever would tell us what's failing.

FYI, it's SuSE's /sbin/getcfg binary that doesn't like the changes.  It
sees /sys/class/net/eth0 as a symlink, and reels off into sys/block (?)
looking for a directory.

lstat64(/sys/class/net/eth0, {st_dev=makedev(0, 0), st_ino=5968, 
st_mode=S_IFLNK|0777, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, 
st_blocks=0, st_size=0, st_atime=2006/09/20-13:59:13, 
st_mtime=2006/09/20-13:58:57, st_ctime=2006/09/20-13:58:57}) = 0
lstat64(/sys/block/eth0, 0xbf9e432c)  = -1 ENOENT (No such file or directory)
open(/proc/mounts, O_RDONLY)  = 3
fstat64(3, {st_dev=makedev(0, 3), st_ino=22711, st_mode=S_IFREG|0444, 
st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, 
st_atime=2006/09/20-14:00:35, st_mtime=2006/09/20-14:00:35, 
st_ctime=2006/09/20-14:00:35}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0xb7f59000
read(3, rootfs / rootfs rw 0 0\nudev /dev..., 4096) = 601
close(3)= 0
munmap(0xb7f59000, 4096)= 0
lstat64(/sys/block, {st_dev=makedev(0, 0), st_ino=256, st_mode=S_IFDIR|0755, 
st_nlink=18, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, 
st_atime=2006/09/20-14:00:17, st_mtime=2006/09/20-13:58:59, 
st_ctime=2006/09/20-13:58:59}) = 0
lstat64(/sys/block, {st_dev=makedev(0, 0), st_ino=256, st_mode=S_IFDIR|0755, 
st_nlink=18, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, 
st_atime=2006/09/20-14:00:17, st_mtime=2006/09/20-13:58:59, 
st_ctime=2006/09/20-13:58:59}) = 0
open(/dev/null, O_RDONLY|O_NONBLOCK|O_DIRECTORY) = -1 ENOTDIR (Not a 
directory)
open(/sys/block, O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 3
fstat64(3, {st_dev=makedev(0, 0), st_ino=256, st_mode=S_IFDIR|0755, 
st_nlink=18, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, 
st_atime=2006/09/20-14:00:17, st_mtime=2006/09/20-13:58:59, 
st_ctime=2006/09/20-13:58:59}) = 0
fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
getdents64(3, {{d_ino=256, d_off=1, d_type=DT_DIR, d_reclen=24, d_name=.} 
{d_ino=1, d_off=2, d_type=DT_DIR, d_reclen=24, d_name=..} {d_ino=11521, 
d_off=3, d_type=DT_DIR, d_reclen=24, d_name=sde} {d_ino=11455, d_off=4, 
d_type=DT_DIR, d_reclen=24, d_name=sdd} {d_ino=11416, d_off=5, d_type=DT_DIR, 
d_reclen=24, d_name=sdc} {d_ino=11358, d_off=6, d_type=DT_DIR, d_reclen=24, 
d_name=sdb} {d_ino=11311, d_off=7, d_type=DT_DIR, d_reclen=24, d_name=sda} 
{d_ino=1784, d_off=8, d_type=DT_DIR, d_reclen=24, d_name=hdd} {d_ino=1770, 
d_off=9, d_type=DT_DIR, d_reclen=24, d_name=hdc} {d_ino=1757, d_off=10, 
d_type=DT_DIR, d_reclen=24, d_name=hda} {d_ino=1725, d_off=11, d_type=DT_DIR, 
d_reclen=32, d_name=loop7} {d_ino=1722, d_off=12, d_type=DT_DIR, d_reclen=32, 
d_name=loop6} {d_ino=1719, d_off=13, d_type=DT_DIR, d_reclen=32, 
d_name=loop5} {d_ino=1716, d_off=14, d_type=DT_DIR, d_reclen=32, 
d_name=loop4} {d_ino=1713, d_off=15, d_type=DT_DIR, d_reclen=32, 
d_name=loop3} {d_ino=1710, d_off=16, d_type=DT_DIR, d_reclen=32, 
d_name=loop2} {d_ino=1707, d_off=17, d_type=DT_DIR, d_reclen=32, 
d_name=loop1} {d_ino=1704, d_off=18, d_type=DT_DIR, d_reclen=32, 
d_name=loop0}}, 4096) = 496
lstat64(/sys/block/sde, {st_dev=makedev(0, 0), st_ino=11521, 
st_mode=S_IFDIR|0755, st_nlink=5, st_uid=0, st_gid=0, st_blksize=4096, 
st_blocks=0, st_size=0, st_atime=2006/09/20-13:59:14, 
st_mtime=2006/09/20-13:58:59, st_ctime=2006/09/20-13:58:59}) = 0
lstat64(/sys/block/sde, {st_dev=makedev(0, 0), st_ino=11521, 
st_mode=S_IFDIR|0755, st_nlink=5, st_uid=0, st_gid=0, st_blksize=4096, 
st_blocks=0, st_size=0, st_atime=2006/09/20-13:59:14, 
st_mtime=2006/09/20-13:58:59, st_ctime=2006/09/20-13:58:59}) = 0
lstat64(/sys/block/sdd, {st_dev=makedev(0, 0), st_ino=11455, 
st_mode=S_IFDIR|0755, st_nlink=5, st_uid=0, st_gid=0, st_blksize=4096, 
st_blocks=0, st_size=0, st_atime=2006/09/20-13:59:14, 
st_mtime=2006/09/20-13:58:59, st_ctime=2006/09/20-13:58:59}) = 0
lstat64(/sys/block/sdd, {st_dev=makedev(0, 0), st_ino=11455, 
st_mode=S_IFDIR|0755, st_nlink=5, st_uid=0, st_gid=0, st_blksize=4096, 
st_blocks=0, st_size=0, st_atime=2006/09/20-13:59:14, 
st_mtime=2006/09/20-13:58:59, st_ctime=2006/09/20-13:58:59}) = 0
lstat64(/sys/block/sdc, {st_dev=makedev(0, 0), st_ino=11416, 
st_mode=S_IFDIR|0755, st_nlink=5, st_uid=0, st_gid=0, st_blksize=4096, 
st_blocks=0, 

Re: 2.6.18-rc7-mm1

2006-09-20 Thread Rafael J. Wysocki
On Wednesday, 20 September 2006 16:23, Mike Galbraith wrote:
 On Tue, 2006-09-19 at 13:36 -0700, Andrew Morton wrote:
  On Tue, 19 Sep 2006 22:25:21 +0200
  Rafael J. Wysocki [EMAIL PROTECTED] wrote:
  
- It took maybe ten hours solid work to get this dogpile vaguely
  compiling and limping to a login prompt on x86, x86_64 and powerpc. 
  I guess it's worth briefly testing if you're keen.
   
   It's not that bad, but unfortunately the networking doesn't work on my 
   system
   (HPC nx6325 + SUSE 10.1 w/ updates, 64-bit).  Apparently, the interfaces 
   don't
   get configured (both tg3 and bcm43xx are affected).
  
  Is there anything interesting in the dmesg output?
  
  Perhaps an `strace -f ifup' or whatever would tell us what's failing.
 
 FYI, it`s SuSE`s /sbin/getcfg binary that doesn't like the changes.  It
 sees /sys/class/net/eth0 as a symlink, and reels off into sys/block (?)
 looking for a directory.

I have filed a report in the SUSE bugzilla.  Let's see what happens.

Greetings,
Rafael


-- 
You never change things by fighting the existing reality.
R. Buckminster Fuller


Re: UDP Out 0f Sequence

2006-09-20 Thread kc

network transit. different datagrams might go through different
routes, hence the out-of-sequence arrival.

On 9/20/06, Majumder, Rajib [EMAIL PROTECTED] wrote:

Hi,

If I write UDP datagrams 1,2 and 3 to network and if the receiver receives in 
order 2,1, and 3, where can the sequence get changed? Is it at the source 
stack, network transit or destination stack?

Any reply is highly appreciated.

Thanks

Rajib

==
Please access the attached hyperlink for an important electronic communications 
disclaimer:

http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
==



Re: [patch 3/4] Make sure ip_vs_ftp ports are valid

2006-09-20 Thread Horms
On Wed, Sep 20, 2006 at 12:29:45PM +0200, Patrick McHardy wrote:
 Horms wrote:
  Here is the revised patch.
  
 
  [IPVS] Make sure ip_vs_ftp ports are valid
 
  I'm not entirely sure what happens in the case of an invalid port,
  at best it'll be silently ignored. This patch ensures that
  the port values are unsigned short values, and thus always valid.
 
  Cc: Patrick McHardy [EMAIL PROTECTED]
  Signed-Off-By: Simon Horman [EMAIL PROTECTED]
 
  Index: linux-2.6/net/ipv4/ipvs/ip_vs_ftp.c
  ===
  --- linux-2.6.orig/net/ipv4/ipvs/ip_vs_ftp.c2006-09-04
 10:47:09.0 +0900
  +++ linux-2.6/net/ipv4/ipvs/ip_vs_ftp.c 2006-09-04 10:59:30.0
 +0900
  @@ -44,8 +44,8 @@
* List of ports (up to IP_VS_APP_MAX_PORTS) to be handled by helper
* First port is set to the default port.
*/
  -static int ports[IP_VS_APP_MAX_PORTS] = {21, 0};
  -module_param_array(ports, int, NULL, 0);
  +static unsigned short ports[IP_VS_APP_MAX_PORTS] = {21, 0};
  +module_param_array(ports, ushort, NULL, 0);
   MODULE_PARM_DESC(ports, "Ports to monitor for FTP control commands");
 
   /*
 
 It looks like the wrong patch went in:
 
 http://marc.theaimsgroup.com/?l=git-commits-head&m=115862407021941&w=2

Thanks for pointing that out. I'll send out patches to reverse
the committed change, and add the newer incarnation.

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/



Re: ipvs locahost client patch for 2.6?

2006-09-20 Thread Horms
On Fri, 11 Aug 2006 01:18:38, Ryan Nowakowski wrote:
 I found this patch for 2.4 that allows the host running ipvs to act
 as it's own client via loopback connection.  Does anyone have a similar
 patch for 2.6?

Not that I am aware of, though that kind of approach may well work for
2.6 with little effort.

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/



Re: [PATCH 12/23] e1000: Maybe stop TX if not enough free descriptors

2006-09-20 Thread Auke Kok

Jeff Garzik wrote:

Actually, I rescind the ACK.

The code should be inside a spinlock, and therefore not need this 
additional check.


If this check were truly needed, then SMP code all over the kernel would 
be broken.


I will drop the patch for now. Once Jesse is back next week he gets to explain 
it all to me :)


Auke


Re: [PATCH 11/23] e1000: Jumbo frames fixes for 82573

2006-09-20 Thread Auke Kok

Jeff Garzik wrote:

Kok, Auke wrote:

Disable jumbo frames for 82573L altogether and when ASPM is enabled
since the hardware has problems with it. For the NICs that do support
this in the 82573 series we set ERT_2048 to attempt to receive as much
traffic as early as we can.

Signed-off-by: Bruce Allan [EMAIL PROTECTED]
Signed-off-by: Auke Kok [EMAIL PROTECTED]
---

 drivers/net/e1000/e1000_main.c |   10 +++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/e1000/e1000_main.c 
b/drivers/net/e1000/e1000_main.c

index e81aa03..2ecec51 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -3138,11 +3138,13 @@ e1000_change_mtu(struct net_device *netd
 }
 break;
 case e1000_82573:
-/* only enable jumbo frames if ASPM is disabled completely
- * this means both bits must be zero in 0x1A bits 3:2 */
+/* Jumbo Frames not supported if:
+ * - this is not an 82573L device
+ * - ASPM is enabled in any way (0x1A bits 3:2) */
 e1000_read_eeprom(&adapter->hw, EEPROM_INIT_3GIO_3, 1,
   &eeprom_data);
-if (eeprom_data & EEPROM_WORD1A_ASPM_MASK) {
+if ((adapter->hw.device_id != E1000_DEV_ID_82573L) ||
+(eeprom_data & EEPROM_WORD1A_ASPM_MASK)) {
 if (max_frame > MAXIMUM_ETHERNET_FRAME_SIZE) {
 DPRINTK(PROBE, ERR,
 "Jumbo Frames not supported.\n");


NAK.  at probe time, set a jumbo-frames-enabled bit, then test it in 
e1000_change_mtu().


Don't include all this chip-checking code into the change_mtu function.
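
A rough userspace sketch of the suggestion above, assuming the chip and
ASPM checks can be folded into one flag at probe time. This is not e1000
code: struct adapter_model, jumbo_ok, model_probe(), model_change_mtu() and
MODEL_STD_FRAME_SIZE are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's standard maximum frame size. */
#define MODEL_STD_FRAME_SIZE 1518

struct adapter_model {
    bool jumbo_ok;  /* hypothetical capability bit, decided once at probe */
};

/* Probe time: all chip-specific checks collapse into a single flag. */
static void model_probe(struct adapter_model *a, bool chip_supports_jumbo,
                        bool aspm_enabled)
{
    a->jumbo_ok = chip_supports_jumbo && !aspm_enabled;
}

/* MTU change: only the flag is tested. Returns 0 on success, -1 if rejected. */
static int model_change_mtu(const struct adapter_model *a, int max_frame)
{
    assert(max_frame > 0);
    if (max_frame > MODEL_STD_FRAME_SIZE && !a->jumbo_ok)
        return -1;  /* jumbo frames not supported on this adapter */
    return 0;
}
```

With this split the MTU path stays free of device IDs and EEPROM reads; the
cost is one extra field that has to be set correctly for every chip.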


I agree with the concept, not with the NAK. This workaround was already there 
and is not a significant new introduction of out-of-band workarounds in code.


Cleaning e1000 up is a major task that will take a few more months. This 
workaround changes 3 lines of code and will help today.


Cheers,

Auke


Re: [PATCH 4/9] network namespaces: socket hashes

2006-09-20 Thread Andrey Savochkin
Hi,

On Mon, Sep 18, 2006 at 05:12:49PM +0200, Daniel Lezcano wrote:
 Andrey Savochkin wrote:
  Socket hash lookups are made within namespace.
  Hash tables are common for all namespaces, with
  additional permutation of indexes.
 
 Hi Andrey,
 
 why is the hash table common and not instanciated multiple times for 
 each namespace like the routes ?

The main reason is that socket hash tables should be large enough to work
efficiently, but it isn't good to waste a lot of memory for each namespace.
Namespaces should be cheap enough to allow hundreds of them.
This memory-efficiency argument, of course, takes priority unless/until
socket hash tables start to resize automatically.
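
As an illustration of one shared table with a per-namespace permutation of
indexes, here is a minimal userspace sketch. It is not the code from the
patch: the XOR mixing, TABLE_SIZE, struct ns_model, hash_perm and the
helper names are assumptions made only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical size of the single shared table; must be a power of two. */
#define TABLE_SIZE 1024u

struct ns_model {
    uint32_t hash_perm;  /* hypothetical per-namespace permutation value */
};

struct sock_model {
    const struct ns_model *ns;  /* namespace owning this socket */
    uint16_t port;
};

/* Bucket index in the shared table: the namespace value is XORed in. */
static uint32_t bucket_index(const struct ns_model *ns, uint32_t sock_hash)
{
    assert((TABLE_SIZE & (TABLE_SIZE - 1)) == 0);
    return (sock_hash ^ ns->hash_perm) & (TABLE_SIZE - 1);
}

/* Buckets are shared, so a lookup must compare the namespace as well. */
static int sock_matches(const struct sock_model *sk,
                        const struct ns_model *ns, uint16_t port)
{
    return sk->ns == ns && sk->port == port;
}
```

The same socket hash lands in different buckets for different namespaces,
so one large table serves all of them without a per-namespace allocation.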

Another point is that routing lookup is much more complicated than the
socket one, which makes adding another search key harder.
Routing also has additional routines for deleting entries matching some
patterns, and so on.
In short, routing is much more complicated, and it is already quite efficient
for various sizes of routing tables.

Andrey


Re: [PATCH 08/23] e1000: add multicast stats counters

2006-09-20 Thread Williams, Mitch A
 +{ "rx_broadcast", E1000_STAT(stats.bprc) },
 +{ "tx_broadcast", E1000_STAT(stats.bptc) },
 +{ "rx_multicast", E1000_STAT(stats.mprc) },
 +{ "tx_multicast", E1000_STAT(stats.mptc) },
  { "rx_errors", E1000_STAT(net_stats.rx_errors) },
  { "tx_errors", E1000_STAT(net_stats.tx_errors) },
  { "tx_dropped", E1000_STAT(net_stats.tx_dropped) },

NAK -- you also need to remove the standard net stats, which are 
exported elsewhere

Jeff, can you please explain the reason for this NAK a little more?
Neither Auke nor I understand why you rejected the patch.

This patch just adds the display of a few more stats in Ethtool.  It
doesn't affect any other counters, and is really just a convenience
feature.  I added this to the driver because of a customer request.

Thank you in advance for edifying us.

-Mitch



Re: UDP Out 0f Sequence

2006-09-20 Thread Rick Jones

Majumder, Rajib wrote:

Hi,

If I write UDP datagrams 1,2 and 3 to network and if the receiver
receives in order 2,1, and 3, where can the sequence get changed? Is it
at the source stack, network transit or destination stack?


Yes. :)

Although network transit is by far the most likely case.  Destination 
stack is a distant second and source stack an even more distant third. 
Generally stack writers try to avoid having places in their stacks where 
things can reorder, but it isn't completely unknown.
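
Since UDP itself carries no ordering information, a receiver that cares
usually puts a sequence number in the payload. A minimal sketch of the
receiving-side check follows; count_reordered() is a hypothetical helper
written for this example, and it ignores sequence number wraparound and
packet loss.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count datagrams whose application-level sequence number is lower than
 * the highest one already seen, i.e. datagrams that arrived late.
 */
static size_t count_reordered(const uint32_t *seq, size_t n)
{
    size_t reordered = 0;
    uint32_t highest = 0;
    size_t i;

    assert(seq != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (i > 0 && seq[i] < highest)
            reordered++;        /* arrived after a newer datagram */
        else
            highest = seq[i];   /* new high-water mark */
    }
    return reordered;
}
```

For the arrival order 2, 1, 3 from the question, datagram 1 is the only one
counted as reordered. The check cannot tell where the reordering happened.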


rick jones


Pull request for 'r8169-20060920-00' tag

2006-09-20 Thread Francois Romieu
Please pull from tag 'r8169-20060920-00' in repository
 
git://electric-eye.fr.zoreil.com/home/romieu/linux-2.6.git

to get the change below.

Note:
Since something went wrong last time I submitted a pull request, here
goes the trace from the start of the branch

$ git rev-list $(git merge-base v2.6.18 r8169-20060920-00)..
d81bf551103cc3bc9e4f7ddf337511d6da0d088f
b39fe41f481d20c201012e4483e76c203802dda7 (- r8169-20060912-00)
d2eed8cff9a1a5d7e12ec9ddf71432c466b104d0
5f787a1aca3705bdc6adbda36f8d6446380e85a6
64e4bfb40c9d07a48c1c7e5b8556e92e7cd7406a
5b0384f4fd079c24b976ee333e6d1f0c95cf14de
b518fa8eac2d0ac497c0fdb27e4cec68d0249bb7
188f4af04618b32b8ec7c630a3f18201c81ce70c
bcf0bf90cd9e9242b66e0563b6a8c8db2e4c262c
4ff96fa67379c31ced69f193c7ffba17051f38e8
623a1593c84afb86b2f496a56fb4ec37f82b5c78
9dccf61112e6755f4e6f154c1794bab3c509bc71
a2b98a697fa4e7564f78905b83db122824916cf9

Shortlog

 
Francois Romieu :
r8169: the MMIO region of the 8167 stands behind BAR#1
 
Patch
-

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 805562b..93cd1f4 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -210,7 +210,7 @@ static const struct {
 static struct pci_device_id rtl8169_pci_tbl[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_REALTEK, 0x8129), 0, 0, RTL_CFG_0 },
{ PCI_DEVICE(PCI_VENDOR_ID_REALTEK, 0x8136), 0, 0, RTL_CFG_2 },
-   { PCI_DEVICE(PCI_VENDOR_ID_REALTEK, 0x8167), 0, 0, RTL_CFG_1 },
+   { PCI_DEVICE(PCI_VENDOR_ID_REALTEK, 0x8167), 0, 0, RTL_CFG_0 },
{ PCI_DEVICE(PCI_VENDOR_ID_REALTEK, 0x8168), 0, 0, RTL_CFG_2 },
{ PCI_DEVICE(PCI_VENDOR_ID_REALTEK, 0x8169), 0, 0, RTL_CFG_0 },
{ PCI_DEVICE(PCI_VENDOR_ID_DLINK,   0x4300), 0, 0, RTL_CFG_0 },

-- 
Ueimor


Re: [PATCH 08/23] e1000: add multicast stats counters

2006-09-20 Thread Jeff Garzik

Williams, Mitch A wrote:

+   { "rx_broadcast", E1000_STAT(stats.bprc) },
+   { "tx_broadcast", E1000_STAT(stats.bptc) },
+   { "rx_multicast", E1000_STAT(stats.mprc) },
+   { "tx_multicast", E1000_STAT(stats.mptc) },
{ "rx_errors", E1000_STAT(net_stats.rx_errors) },
{ "tx_errors", E1000_STAT(net_stats.tx_errors) },
{ "tx_dropped", E1000_STAT(net_stats.tx_dropped) },
NAK -- you also need to remove the standard net stats, which are 
exported elsewhere


Jeff, can you please explain the reason for this NAK a little more?
Neither Auke nor I understand why you rejected the patch.

This patch just adds the display of a few more stats in Ethtool.  It
doesn't affect any other counters, and is really just a convenience
feature.  I added this to the driver because of a customer request.


Adding those stats is fine.  You guys just need to remove the existing 
mess first.


Jeff





[PATCH 1/2] mv643xx_eth: restrict to 32-bit PPC_MULTIPLATFORM

2006-09-20 Thread Dale Farnsworth
From: Dale Farnsworth [EMAIL PROTECTED]

No 64-bit PPC_MULTIPLATFORM platforms use the mv643xx_eth driver,
so build it only on PPC32.

Signed-off-by: Dale Farnsworth [EMAIL PROTECTED]
Acked-by: Sven Luther [EMAIL PROTECTED]

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index a2bd811..2154ae2 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -2262,7 +2262,7 @@ config UGETH_HAS_GIGA
 
 config MV643XX_ETH
tristate "MV-643XX Ethernet support"
-   depends on MOMENCO_OCELOT_C || MOMENCO_JAGUAR_ATX || MV64360 || MOMENCO_OCELOT_3 || PPC_MULTIPLATFORM
+   depends on MOMENCO_OCELOT_C || MOMENCO_JAGUAR_ATX || MV64360 || MOMENCO_OCELOT_3 || (PPC_MULTIPLATFORM && PPC32)
select MII
help
  This driver supports the gigabit Ethernet on the Marvell MV643XX


[PATCH 2/2] mv643xx_eth: Fix typo: RX_SKB_SIZE == ETH_RX_SKB_SIZE

2006-09-20 Thread Dale Farnsworth
From: Dale Farnsworth [EMAIL PROTECTED]

Bug was introduced in commit 71d28725548be203e8b8f6ad63b1f64fd7f02d4d.
How embarrassing.  It wasn't caught because dma_unmap_single()
is defined away on arch/ppc and 32-bit arch/powerpc.

Signed-off-by: Dale Farnsworth [EMAIL PROTECTED]

---

Arggh.  (And that's not pirate talk.)

This isn't urgent since dma_unmap_single() is defined away for ppc32
both in arch/ppc and arch/powerpc.  It was caught on ppc64 arch/powerpc,
but isn't needed by any ppc64 platforms.
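
For readers wondering how such a typo can build at all, here is a small
userspace sketch of the effect. It is not the kernel's definition: the
model_* names, MODEL_RX_SIZE and the deliberately misspelled MODEL_RX_SIZ
are invented for illustration. When a call is a macro that discards its
arguments, the compiler never sees them.

```c
#include <assert.h>

/* Hypothetical correctly spelled constant. */
#define MODEL_RX_SIZE 1536

/* Variant 1: "defined away". The preprocessor drops both arguments. */
#define model_unmap_noop(addr, size) do { } while (0)

/* Variant 2: a real function, so every argument has to compile. */
static int unmapped_bytes;

static void model_unmap_real(void *addr, int size)
{
    assert(size > 0);
    (void)addr;
    unmapped_bytes += size;
}

static void receive_one(void *buf)
{
    /*
     * MODEL_RX_SIZ is misspelled on purpose and defined nowhere. This
     * still compiles, because the macro expands to an empty statement.
     */
    model_unmap_noop(buf, MODEL_RX_SIZ);

    /* The same misspelling here would be a compile error. */
    model_unmap_real(buf, MODEL_RX_SIZE);
}
```

Building the driver on a platform where the call is a real function is
what exposes the misspelled name, which matches how this one was found.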

diff --git a/drivers/net/mv643xx_eth.c b/drivers/net/mv643xx_eth.c
index eeab1df..59de3e7 100644
--- a/drivers/net/mv643xx_eth.c
+++ b/drivers/net/mv643xx_eth.c
@@ -385,7 +385,7 @@ static int mv643xx_eth_receive_queue(str
struct pkt_info pkt_info;
 
while (budget-- > 0 && eth_port_receive(mp, &pkt_info) == ETH_OK) {
-   dma_unmap_single(NULL, pkt_info.buf_ptr, RX_SKB_SIZE,
+   dma_unmap_single(NULL, pkt_info.buf_ptr, ETH_RX_SKB_SIZE,
DMA_FROM_DEVICE);
mp->rx_desc_count--;
received_packets++;


[NET] GT96100: Delete bitrotting ethernet driver

2006-09-20 Thread Ralf Baechle
Code for the EV96100 evaluation board hasn't compiled since at least
November 15, 2003, so it is being deleted as of 2.6.18 due to lack of
a user base.

Signed-off-by: Ralf Baechle [EMAIL PROTECTED]

 drivers/net/Kconfig  |6
 drivers/net/Makefile |1
 drivers/net/gt96100eth.c | 1566 --
 drivers/net/gt96100eth.h |  346 --
 4 files changed, 0 insertions(+), 1919 deletions(-)

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 778fbae..0ee6d60 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -446,12 +446,6 @@ config GALILEO_64240_ETH
  This is the driver for the ethernet interfaces integrated into
  the Galileo (now Marvell) GT64240 chipset.
 
-config MIPS_GT96100ETH
-   bool "MIPS GT96100 Ethernet support"
-   depends on NET_ETHERNET && MIPS_GT96100
-   help
- Say Y here to support the Ethernet subsystem on your GT96100 card.
-
 config MIPS_AU1X00_ENET
bool "MIPS AU1000 Ethernet support"
depends on NET_ETHERNET && SOC_AU1X00
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index faf24de..eb48c55 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -179,7 +179,6 @@ obj-$(CONFIG_HPLANCE) += hplance.o 7990.
 obj-$(CONFIG_MVME147_NET) += mvme147.o 7990.o
 obj-$(CONFIG_EQUALIZER) += eql.o
 obj-$(CONFIG_MIPS_JAZZ_SONIC) += jazzsonic.o
-obj-$(CONFIG_MIPS_GT96100ETH) += gt96100eth.o
 obj-$(CONFIG_MIPS_AU1X00_ENET) += au1000_eth.o
 obj-$(CONFIG_MIPS_SIM_NET) += mipsnet.o
 obj-$(CONFIG_SGI_IOC3_ETH) += ioc3-eth.o
diff --git a/drivers/net/gt96100eth.c b/drivers/net/gt96100eth.c
deleted file mode 100644
index 2b4db74..000
--- a/drivers/net/gt96100eth.c
+++ /dev/null
@@ -1,1566 +0,0 @@
-/*
- * Copyright 2000, 2001 MontaVista Software Inc.
- * Author: MontaVista Software, Inc.
- * [EMAIL PROTECTED] or [EMAIL PROTECTED]
- *
- *  This program is free software; you can distribute it and/or modify it
- *  under the terms of the GNU General Public License (Version 2) as
- *  published by the Free Software Foundation.
- *
- *  This program is distributed in the hope it will be useful, but WITHOUT
- *  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- *  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
- *  for more details.
- *
- *  You should have received a copy of the GNU General Public License along
- *  with this program; if not, write to the Free Software Foundation, Inc.,
- *  59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
- *
- * Ethernet driver for the MIPS GT96100 Advanced Communication Controller.
- * 
- *  Revision history
- *
- *11.11.2001  Moved to 2.4.14, [EMAIL PROTECTED]  Modified driver to add
- *proper gt96100A support.
- *12.05.2001  Moved eth port 0 to irq 3 (mapped to GT_SERINT0 on EV96100A)
- *in order for both ports to work. Also cleaned up boot
- *option support (mac address string parsing), fleshed out
- *gt96100_cleanup_module(), and other general code cleanups
- *[EMAIL PROTECTED].
- */
-#include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/string.h>
-#include <linux/timer.h>
-#include <linux/errno.h>
-#include <linux/in.h>
-#include <linux/ioport.h>
-#include <linux/slab.h>
-#include <linux/interrupt.h>
-#include <linux/pci.h>
-#include <linux/init.h>
-#include <linux/netdevice.h>
-#include <linux/etherdevice.h>
-#include <linux/skbuff.h>
-#include <linux/delay.h>
-#include <linux/ctype.h>
-#include <linux/bitops.h>
-
-#include <asm/irq.h>
-#include <asm/io.h>
-
-#define DESC_BE 1
-#define DESC_DATA_BE 1
-
-#define GT96100_DEBUG 2
-
-#include "gt96100eth.h"
-
-// prototypes
-static void* dmaalloc(size_t size, dma_addr_t *dma_handle);
-static void dmafree(size_t size, void *vaddr);
-static void gt96100_delay(int msec);
-static int gt96100_add_hash_entry(struct net_device *dev,
- unsigned char* addr);
-static void read_mib_counters(struct gt96100_private *gp);
-static int read_MII(int phy_addr, u32 reg);
-static int write_MII(int phy_addr, u32 reg, u16 data);
-static int gt96100_init_module(void);
-static void gt96100_cleanup_module(void);
-static void dump_MII(int dbg_lvl, struct net_device *dev);
-static void dump_tx_desc(int dbg_lvl, struct net_device *dev, int i);
-static void dump_rx_desc(int dbg_lvl, struct net_device *dev, int i);
-static void dump_skb(int dbg_lvl, struct net_device *dev,
-struct sk_buff *skb);
-static void update_stats(struct gt96100_private *gp);
-static void abort(struct net_device *dev, u32 abort_bits);
-static void hard_stop(struct net_device *dev);
-static void enable_ether_irq(struct net_device *dev);
-static void disable_ether_irq(struct net_device *dev);
-static int gt96100_probe1(struct pci_dev *pci, int port_num);
-static void reset_tx(struct net_device *dev);
-static void reset_rx(struct net_device *dev);
-static int 

RE: [PATCH 4/7] secid reconciliation-v02: Invoke LSM hook for out bound traffic

2006-09-20 Thread Venkat Yekkirala
See below.

 -Original Message-
 From: James Morris [mailto:[EMAIL PROTECTED]
 Sent: Monday, September 18, 2006 2:12 PM
 To: Venkat Yekkirala
 Cc: netdev@vger.kernel.org; [EMAIL PROTECTED]; [EMAIL PROTECTED];
 [EMAIL PROTECTED]
 Subject: Re: [PATCH 4/7] secid reconciliation-v02: Invoke LSM hook for
 outbound traffic
 
 
 On Fri, 8 Sep 2006, Venkat Yekkirala wrote:
 
  -static void secmark_restore(struct sk_buff *skb)
  +static unsigned int secmark_restore(struct sk_buff *skb, unsigned int hooknum,
  +                                    const struct xt_target *target)
  {
  -       if (!skb->secmark) {
  -               u32 *connsecmark;
  -               enum ip_conntrack_info ctinfo;
  +       u32 *psecmark;
  +       u32 secmark = 0;
  +       enum ip_conntrack_info ctinfo;
  
  -               connsecmark = nf_ct_get_secmark(skb, &ctinfo);
  -               if (connsecmark && *connsecmark)
  -                       if (skb->secmark != *connsecmark)
  -                               skb->secmark = *connsecmark;
  -       }
  +       psecmark = nf_ct_get_secmark(skb, &ctinfo);
  +       if (psecmark)
  +               secmark = *psecmark;
  +
  +       if (!secmark)
  +               return XT_CONTINUE;
  +
  +       /* Set secmark on inbound and filter it on outbound */
  +       if (hooknum == NF_IP_POST_ROUTING || hooknum == NF_IP6_POST_ROUTING) {
  +               if (!security_skb_netfilter_check(skb, secmark))
  +                       return NF_DROP;
  +       } else
  +               if (skb->secmark != secmark)
  +                       skb->secmark = secmark;
  +
  +       return XT_CONTINUE;
  }
 
 Quite a lot of logic has changed here.
 
 With the original code, we only restored a secmark once for the lifetime
 of a packet or connection (to make behavior deterministic and security
 marks immutable in the face of arbitrarily complex iptables rules).
 
 With your patch, secmarks are always writable.

Hopefully the following thread addressed these concerns.
http://marc.theaimsgroup.com/?l=selinux&m=115870100405571&w=2

 
 What about packets on the OUTPUT hook?

I will check for OUTPUT as well as POSTROUTING to kickoff skb_flow_out().

 
 Also, we did not restore a 'null' (zero) secmark to the skb 
 (while this 
 should never happen with the current SECMARK target, there may be 
 non-SELinux extensions later which set a null marking).

How do you envision this (i.e. restoring a null secmark) being useful?
secmark is zero by default anyway (when no labeling rules exist for the
connection), right?

 
 Why not just do something like:
 
 
   psecmark = nf_ct_get_secmark(skb, ctinfo);
   if (psecmark && *psecmark) {
 
   ... core of function ...
 
   }
 
   return XT_CONTINUE;
 
 I don't think you need the new secmark variable.

Will do.
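
For readers following along, the restructuring suggested above can be
sketched in userspace C. This is only an illustration under stated
assumptions: struct fake_skb, the SKETCH_* and HOOK_* constants, and
fake_netfilter_check() are hypothetical stand-ins invented here, not the
kernel's definitions, and treating the OUTPUT hook like POSTROUTING follows
the intent stated earlier in this reply rather than any posted code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel verdicts and hook numbers. */
enum { SKETCH_CONTINUE, SKETCH_DROP };
enum { HOOK_PRE_ROUTING, HOOK_LOCAL_OUT, HOOK_POST_ROUTING };

/* Hypothetical stand-in for struct sk_buff; only the secmark matters here. */
struct fake_skb {
    uint32_t secmark;
};

/* Stand-in for security_skb_netfilter_check(): nonzero means "allowed". */
static int fake_netfilter_check(const struct fake_skb *skb, uint32_t secid)
{
    (void)skb;
    return secid != 0xdead;     /* arbitrary rule, for demonstration only */
}

/*
 * conn_secmark models the pointer returned by nf_ct_get_secmark(); it may
 * be NULL. The whole body is guarded by a single "pointer and non-zero"
 * test, so no separate secmark variable is needed.
 */
static unsigned int secmark_restore_sketch(struct fake_skb *skb,
                                           const uint32_t *conn_secmark,
                                           int hooknum)
{
    assert(skb != NULL);

    if (conn_secmark && *conn_secmark) {
        if (hooknum == HOOK_POST_ROUTING || hooknum == HOOK_LOCAL_OUT) {
            /* Outbound: filter against the connection's mark. */
            if (!fake_netfilter_check(skb, *conn_secmark))
                return SKETCH_DROP;
        } else if (skb->secmark != *conn_secmark) {
            /* Inbound: copy the connection's mark onto the packet. */
            skb->secmark = *conn_secmark;
        }
    }

    return SKETCH_CONTINUE;
}
```

A null or zero connection mark falls straight through to the continue
verdict, which matches the "never restore a null marking" behavior discussed
above.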

 
 You've also changed the logic for the dummy case of 
 security_skb_netfilter_check()

I am not getting this. This is a new function. Did you mean
to point to a different function?

 
 
 +static inline int security_skb_netfilter_check(struct sk_buff *skb,
 +   u32 nf_secid)
 +{
 +   return 1;
 +}
 +
 
 This code does not now behave as it did originally.  Keep in 
 mind that 
 SELinux is not the only user of SECMARK.

Missed this as well (this is a new function in this patch). Please
elaborate.
 
 (The documentation of the hook in security.h doesn't match 
 the behavior, 
 either -- it's (re-)labeling, not just filtering).

Will fix this.

 
 I really don't know if connection tracking is the right place to be doing
 policy enforcement, either.  Perhaps you should just do the relabeling here
 and enforcement later.

We could have done enforcement, in the SELinux postroute_last
hook for example, if only there were a place to hold onto the
exit point context, separate from the label already associated
with the skb in the secmark field. postroute_last would need BOTH
the label of the skb (available in the secmark field) and the
exit point context to do enforcement.


Re: [PATCH] Remove powerpc specific parts of 3c509 driver

2006-09-20 Thread Segher Boessenkool
Sure, PCI busses are little-endian.  But is readX()/writeX() for PCI only?


Yes.

For other buses, use foo_writel(), etc.

Can this please be documented then?  Never heard this before...


You have come late to the party.


What do you mean here?  Could you please explain?


This has been the case for many, many years.


No, it was never documented AFAICS.


And there is no point in a massive rename to pci_writel(), either.


That would be really inconvenient, sure.  It's also inconvenient
that all the nice short names are PCI-only.


Segher
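
The convention under discussion (readX()/writeX() assume little-endian PCI
registers, other buses get their own foo_readl()-style accessors) can be
illustrated with a small userspace sketch. Everything here is an assumption
made for illustration: the register is modeled as a plain byte array, and
sketch_le32_load() and foo_readl_sketch() are hypothetical names, not kernel
functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Little-endian load, the byte order assumed for PCI registers in this
 * thread: byte 0 is the least significant byte.
 */
static uint32_t sketch_le32_load(const uint8_t *reg)
{
    assert(reg != 0);
    return (uint32_t)reg[0] |
           ((uint32_t)reg[1] << 8) |
           ((uint32_t)reg[2] << 16) |
           ((uint32_t)reg[3] << 24);
}

/*
 * Hypothetical accessor for a big-endian bus "foo": byte 0 is the most
 * significant byte, so the driver wraps its own helper instead of reusing
 * the little-endian one.
 */
static uint32_t foo_readl_sketch(const uint8_t *reg)
{
    assert(reg != 0);
    return ((uint32_t)reg[0] << 24) |
           ((uint32_t)reg[1] << 16) |
           ((uint32_t)reg[2] << 8) |
           (uint32_t)reg[3];
}
```

Both helpers are independent of host byte order because they assemble the
value byte by byte, which is the property a per-bus accessor is meant to
guarantee.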



[PATCH] tcp: default congestion control menu

2006-09-20 Thread Stephen Hemminger
Change how the default TCP congestion control is chosen. Don't just use the
last installed module; instead, allow selection during configuration, and
make sure the default is used regardless of load order.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
---
 net/ipv4/Kconfig   |   45 -
 net/ipv4/sysctl_net_ipv4.c |6 ++
 net/ipv4/tcp_cong.c|2 +-
 3 files changed, 47 insertions(+), 6 deletions(-)

--- net-2.6.19.orig/net/ipv4/Kconfig2006-09-19 16:13:02.0 -0700
+++ net-2.6.19/net/ipv4/Kconfig 2006-09-20 11:17:45.0 -0700
@@ -447,7 +447,7 @@
depends on INET_DIAG
def_tristate INET_DIAG
 
-config TCP_CONG_ADVANCED
+menuconfig TCP_CONG_ADVANCED
bool "TCP: advanced congestion control"
---help---
  Support for selection of various TCP congestion control
@@ -458,9 +458,7 @@
 
  If unsure, say N.
 
-# TCP Reno is builtin (required as fallback)
-menu "TCP congestion control"
-   depends on TCP_CONG_ADVANCED
+if TCP_CONG_ADVANCED
 
 config TCP_CONG_BIC
tristate "Binary Increase Congestion (BIC) control"
@@ -573,12 +571,49 @@
loss packets.
See http://www.ntu.edu.sg/home5/ZHOU0022/papers/CPFu03a.pdf
 
-endmenu
+choice
+   prompt "Default TCP congestion control"
+   default DEFAULT_BIC
+   help
+ Select the TCP congestion control that will be used by default
+ for all connections.
+
+   config DEFAULT_BIC
+   bool "Bic" if TCP_CONG_BIC=y
+
+   config DEFAULT_CUBIC
+   bool "Cubic" if TCP_CONG_CUBIC=y
+
+   config DEFAULT_HTCP
+   bool "Htcp" if TCP_CONG_HTCP=y
+
+   config DEFAULT_VEGAS
+   bool "Vegas" if TCP_CONG_VEGAS=y
+
+   config DEFAULT_WESTWOOD
+   bool "Westwood" if TCP_CONG_WESTWOOD=y
+
+   config DEFAULT_RENO
+   bool "Reno"
+
+endchoice
+
+endif
 
 config TCP_CONG_BIC
tristate
depends on !TCP_CONG_ADVANCED
default y
 
+config DEFAULT_TCP_CONG
+   string
+   default "bic" if DEFAULT_BIC
+   default "cubic" if DEFAULT_CUBIC
+   default "htcp" if DEFAULT_HTCP
+   default "vegas" if DEFAULT_VEGAS
+   default "westwood" if DEFAULT_WESTWOOD
+   default "reno" if DEFAULT_RENO
+   default "bic"
+
 source net/ipv4/ipvs/Kconfig
 
--- net-2.6.19.orig/net/ipv4/sysctl_net_ipv4.c  2006-09-19 16:13:02.0 -0700
+++ net-2.6.19/net/ipv4/sysctl_net_ipv4.c   2006-09-19 16:13:05.0 -0700
@@ -129,6 +129,12 @@
return ret;
 }
 
+static int __init tcp_congestion_default(void)
+{
+   return tcp_set_default_congestion_control(CONFIG_DEFAULT_TCP_CONG);
+}
+
+late_initcall(tcp_congestion_default);
 
 ctl_table ipv4_table[] = {
 {
--- net-2.6.19.orig/net/ipv4/tcp_cong.c 2006-09-19 16:13:02.0 -0700
+++ net-2.6.19/net/ipv4/tcp_cong.c  2006-09-19 16:13:05.0 -0700
@@ -48,7 +48,7 @@
printk(KERN_NOTICE "TCP %s already registered\n", ca->name);
ret = -EEXIST;
} else {
-   list_add_rcu(&ca->list, &tcp_cong_list);
+   list_add_tail_rcu(&ca->list, &tcp_cong_list);
printk(KERN_INFO "TCP %s registered\n", ca->name);
}
spin_unlock(&tcp_cong_list_lock);
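
Why the switch from head to tail insertion matters can be shown with a toy
model. The sketch below is a userspace illustration only, and it assumes
(not confirmed by this patch text) that the default algorithm is whichever
entry sits at the head of the list and that setting a default by name moves
that entry to the head. struct sketch_registry and the sketch_* helpers are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX 8

/* Toy registry: index 0 plays the role of the list head (the default). */
struct sketch_registry {
    const char *names[SKETCH_MAX];
    size_t count;
};

/* Models head insertion: the most recently registered entry becomes default. */
static void sketch_add_head(struct sketch_registry *r, const char *name)
{
    assert(r->count < SKETCH_MAX);
    memmove(&r->names[1], &r->names[0], r->count * sizeof(r->names[0]));
    r->names[0] = name;
    r->count++;
}

/* Models tail insertion: registering never displaces the current head. */
static void sketch_add_tail(struct sketch_registry *r, const char *name)
{
    assert(r->count < SKETCH_MAX);
    r->names[r->count++] = name;
}

/* Models selecting a default by name: move the match to the head. */
static int sketch_set_default(struct sketch_registry *r, const char *name)
{
    for (size_t i = 0; i < r->count; i++) {
        if (strcmp(r->names[i], name) == 0) {
            const char *found = r->names[i];

            memmove(&r->names[1], &r->names[0], i * sizeof(r->names[0]));
            r->names[0] = found;
            return 0;
        }
    }
    return -1;  /* not registered */
}
```

With head insertion, registering "reno", "bic" and then "vegas" leaves
"vegas" as the default, so the outcome depends on load order. With tail
insertion the head stays put until something like sketch_set_default() is
called with the configured name.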


[PATCH] tcp: make cubic the default

2006-09-20 Thread Stephen Hemminger
Change default congestion control used from BIC to the newer CUBIC,
which is the successor to BIC but has better properties over long delay links.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---
 net/ipv4/Kconfig |   12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

--- net-test.orig/net/ipv4/Kconfig  2006-09-20 12:22:06.0 -0700
+++ net-test/net/ipv4/Kconfig   2006-09-20 13:31:21.0 -0700
@@ -454,7 +454,7 @@
  modules.
 
  Nearly all users can safely say no here, and a safe default
- selection will be made (BIC-TCP with new Reno as a fallback).
+ selection will be made (CUBIC with new Reno as a fallback).
 
  If unsure, say N.
 
@@ -462,7 +462,7 @@
 
 config TCP_CONG_BIC
tristate "Binary Increase Congestion (BIC) control"
-   default y
+   default m
---help---
BIC-TCP is a sender-side only change that ensures a linear RTT
fairness under large windows while offering both scalability and
@@ -476,7 +476,7 @@
 
 config TCP_CONG_CUBIC
tristate "CUBIC TCP"
-   default m
+   default y
---help---
This is version 2.0 of BIC-TCP which uses a cubic growth function
among other techniques.
@@ -573,7 +573,7 @@
 
 choice
prompt "Default TCP congestion control"
-   default DEFAULT_BIC
+   default DEFAULT_CUBIC
help
  Select the TCP congestion control that will be used by default
  for all connections.
@@ -600,7 +600,7 @@
 
 endif
 
-config TCP_CONG_BIC
+config TCP_CONG_CUBIC
tristate
depends on !TCP_CONG_ADVANCED
default y
@@ -613,7 +613,7 @@
default "vegas" if DEFAULT_VEGAS
default "westwood" if DEFAULT_WESTWOOD
default "reno" if DEFAULT_RENO
-   default "bic"
+   default "cubic"
 
 source net/ipv4/ipvs/Kconfig
 


RE: [PATCH 7/7] secid reconciliation-v02: Enforcement for SELinux

2006-09-20 Thread Venkat Yekkirala
  +static int selinux_skb_policy_check(struct sk_buff *skb, unsigned short family)
  +{
  +   u32 xfrm_sid, trans_sid;
  +   int err;
  +
  +   if (selinux_compat_net)
  +   return 1;
  +
  +   err = selinux_xfrm_decode_session(skb, &xfrm_sid, 0);
  +   BUG_ON(err);
 
 First, any reason against including the struct sock * in 
 the LSM hook?  At a 
 quick glance it looks like it is available at each place 
 security_skb_policy_check() is invoked?  If there are no 
 objections I would 
 like to see it included in the hook.

There's no sock available (NULL) for forward, no-sock, time-wait cases, etc.

What are you trying to accomplish with the sock here anyway?

 
 Second, I wonder if it would be better to do a NetLabel/CIPSO 
 query here using 
 the xfrm_sid as the NetLabel base_sid instead of at the end 
 of the function 
 (see your comment)?  This way we wouldn't have to duplicate the 
 avc_has_perm() and security_transition_sid() calls for both xfrm and 
 NetLabel.

There's a need for an additional avc_has_perm check anyway between
the cipso label and the ipsec/transition label, to check to make sure
the cipso level falls within the range on the IPSec/transition SA.

No need for a new transition between ipsec/transition label and the cipso
label since the cipso label would be sharing the TE portion with the
ipsec/transition label (this could change in the future, when you get round
to doing entire SELinux contexts over the wire). For now, you would just
set the secmark to the cipso label if the label could come thru
(i.e. if the avc_has_perm succeeds).


Re: [PATCH] Remove powerpc specific parts of 3c509 driver

2006-09-20 Thread Jeff Garzik

Segher Boessenkool wrote:

Sure, PCI busses are little-endian.  But is readX()/writeX() for PCI
only?


Yes.

For other buses, use foo_writel(), etc.

Can this please be documented then?  Never heard this before...


You have come late to the party.


What do you mean here?  Could you please explain?


This has been the case for many, many years.


No, it was never documented AFAICS.


A de facto standard does not need to be documented, to be a de facto 
standard.


A lot of Linux standards are often based on emails from Linus buried 
halfway down a thread.  A decision gets made, and people follow.




And there is no point in a massive rename to pci_writel(), either.


That would be really inconvenient, sure.  It's also inconvenient
that all the nice short names are PCI-only.


Only to you, a decided minority of developers.

Jeff





Re: Update of the r8169 branch

2006-09-20 Thread Francois Romieu
(adding netdev to Cc: so that the patch gets publicly known)

Boris B. Zhmurov [EMAIL PROTECTED] :
[...]
 Hello Francois. I've figured out, that this patch wasn't merged in 
 linux-2.6.18 :(

Bad timing. Patches are available.

 Are there any plans to merge it in mainline ?

Jeff pulled most of the r8169 branch. I can't answer for him but
I guess that the answer is RSN.

 And are there any patches available against linux-2.6.18 release?

The content of the r8169 branch in the git repository can be retrieved at:

http://www.fr.zoreil.com/people/francois/misc/20060920-2.6.18-r8169-test.patch

It should support 8167 as is.

-- 
Ueimor


Re: [PATCH] sky2: process tx pause frames.

2006-09-20 Thread Jeff Garzik

Stephen Hemminger wrote:

This patch already is in 2.6.17 stable, but the bigger version was pushed
off till 2.6.19. Here is a less intrusive version that needs to go into 2.6.18
(or I'll end up sending it for 2.6.18.1). The driver was telling the
GMAC to flush (not process) pause frames. Manually disabling pause wasn't
working because of problems in the setup.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


You'll need to send this to [EMAIL PROTECTED]

Jeff





Re: [PATCH] sky2: process tx pause frames.

2006-09-20 Thread Stephen Hemminger
On Wed, 20 Sep 2006 17:49:41 -0400
Jeff Garzik [EMAIL PROTECTED] wrote:

 Stephen Hemminger wrote:
  This patch already is in 2.6.17 stable, but the bigger version was pushed
  off till 2.6.19. Here is a less intrusive version that needs to go into 
  2.6.18
  (or I'll end up sending it for 2.6.18.1). The driver was telling the
  GMAC to flush (not process) pause frames. Manually disabling pause wasn't
  working because of problems in the setup.
  
  Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
 
 You'll need to send this to [EMAIL PROTECTED]
 
   Jeff

I did already thanks.
-- 
Stephen Hemminger [EMAIL PROTECTED]


Re: [PATCH 7/7] secid reconciliation-v02: Enforcement for SELinux

2006-09-20 Thread Paul Moore
Venkat Yekkirala wrote:
+static int selinux_skb_policy_check(struct sk_buff *skb, unsigned short family)
+{
+u32 xfrm_sid, trans_sid;
+int err;
+
+if (selinux_compat_net)
+return 1;
+
+err = selinux_xfrm_decode_session(skb, &xfrm_sid, 0);
+BUG_ON(err);

First, any reason against including the struct sock * in 
the LSM hook?  At a 
quick glance it looks like it is available at each place 
security_skb_policy_check() is invoked?  If there are no 
objections I would 
like to see it included in the hook.
  
 There's no sock available (NULL) for forward, no-sock, time-wait cases, etc.

... which would be why I should have taken a closer look :)

 What are you trying to accomplish with the sock here anyway?

Actually this is no longer an issue because of something else - you can
ignore this now.

-- 
paul moore
linux security @ hp


Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-20 Thread Stephen Hemminger
On Mon, 18 Sep 2006 06:56:55 -0700 (PDT)
David Miller [EMAIL PROTECTED] wrote:

 From: Alexey Kuznetsov [EMAIL PROTECTED]
 Date: Mon, 18 Sep 2006 14:37:05 +0400
 
   It looks perfectly fine to me, would you like me to apply it
   Alexey?
  
  Yes, I think it is safe.
 
 Ok, I'll put this into net-2.6.19 for now.  Thanks.

Did you try this on a desktop system?  Something is wrong with net-2.6.19;
basic web browsing seems slower.

-- 
Stephen Hemminger [EMAIL PROTECTED]


Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-20 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Wed, 20 Sep 2006 15:44:06 -0700

 On Mon, 18 Sep 2006 06:56:55 -0700 (PDT)
 David Miller [EMAIL PROTECTED] wrote:
 
  Ok, I'll put this into net-2.6.19 for now.  Thanks.
 
 Did you try this on a desktop system?  Something is wrong with net-2.6.19
 basic web browsing seems slower.

It might be due to other changes, please verify that it's
truly caused by Alexey's change by backing it out and
retesting.

Note that I had to use an updated version of Alexey's change,
which he sent me privately, because the first version didn't
compile :)


Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-20 Thread Stephen Hemminger
On Wed, 20 Sep 2006 15:47:56 -0700 (PDT)
David Miller [EMAIL PROTECTED] wrote:

 From: Stephen Hemminger [EMAIL PROTECTED]
 Date: Wed, 20 Sep 2006 15:44:06 -0700
 
  On Mon, 18 Sep 2006 06:56:55 -0700 (PDT)
  David Miller [EMAIL PROTECTED] wrote:
  
   Ok, I'll put this into net-2.6.19 for now.  Thanks.
  
  Did you try this on a desktop system?  Something is wrong with net-2.6.19
  basic web browsing seems slower.
 
 It might be due to other changes, please verify that it's
 truly caused by Alexey's change by backing it out and
 retesting.
 
 Note that I had to use an updated version of Alexey's change,
 which he sent me privately, because the first version didn't
 compile :)

It might be something else.. there are a lot of changes from 2.6.18 to 
net-2.6.19.



-- 
Stephen Hemminger [EMAIL PROTECTED]


[RFC 1/6] Make splice_to_pipe non-static and move structure definitions to a header file

2006-09-20 Thread Ashwini Kulkarni

---

 fs/splice.c   |   18 +-
 include/linux/pipe_fs_i.h |   18 ++
 2 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index 684bca3..c6a880b 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -29,22 +29,6 @@
 #include <linux/syscalls.h>
 #include <linux/uio.h>
 
-struct partial_page {
-   unsigned int offset;
-   unsigned int len;
-};
-
-/*
- * Passed to splice_to_pipe
- */
-struct splice_pipe_desc {
-   struct page **pages;/* page map */
-   struct partial_page *partial;   /* pages[] may not be contig */
-   int nr_pages;   /* number of pages in map */
-   unsigned int flags; /* splice flags */
-   struct pipe_buf_operations *ops;/* ops associated with output pipe */
-};
-
 /*
  * Attempt to steal a page from a pipe buffer. This should perhaps go into
  * a vm helper function, it's already simplified quite a bit by the
@@ -173,7 +157,7 @@ static struct pipe_buf_operations user_p
  * Pipe output worker. This sets up our pipe format with the page cache
  * pipe buffer operations. Otherwise very similar to the regular pipe_writev().
  */
-static ssize_t splice_to_pipe(struct pipe_inode_info *pipe,
+ssize_t splice_to_pipe(struct pipe_inode_info *pipe,
  struct splice_pipe_desc *spd)
 {
int ret, do_wakeup, page_nr;
diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h
index ea4f7cd..9067985 100644
--- a/include/linux/pipe_fs_i.h
+++ b/include/linux/pipe_fs_i.h
@@ -100,4 +100,22 @@ extern ssize_t splice_from_pipe(struct p
loff_t *, size_t, unsigned int,
splice_actor *);
 
+struct partial_page {
+   unsigned int offset;
+   unsigned int len;
+};
+
+/*
+ * Passed to splice_to_pipe
+ */
+struct splice_pipe_desc {
+   struct page **pages;/* page map */
+   struct partial_page *partial;   /* pages[] may not be contig */
+   int nr_pages;   /* number of pages in map */
+   unsigned int flags; /* splice flags */
+   struct pipe_buf_operations *ops;/* ops associated with output pipe */
+};
+
+ssize_t splice_to_pipe(struct pipe_inode_info *, struct splice_pipe_desc *);
+
 #endif



[RFC 2/6] Make sock_def_wakeup non-static

2006-09-20 Thread Ashwini Kulkarni

---

 include/net/sock.h |1 +
 net/core/sock.c|3 ++-
 2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 324b3ea..3a64262 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -497,6 +497,7 @@ extern void sk_stream_wait_close(struct 
 extern int sk_stream_error(struct sock *sk, int flags, int err);
 extern void sk_stream_kill_queues(struct sock *sk);
 
+extern void sock_def_wakeup(struct sock *sk);
 extern int sk_wait_data(struct sock *sk, long *timeo);
 
 struct request_sock_ops;
diff --git a/net/core/sock.c b/net/core/sock.c
index 51fcfbc..8496854 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1400,7 +1400,7 @@ ssize_t sock_no_sendpage(struct socket *
  * Default Socket Callbacks
  */
 
-static void sock_def_wakeup(struct sock *sk)
+void sock_def_wakeup(struct sock *sk)
 {
read_lock(&sk->sk_callback_lock);
if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
@@ -1961,6 +1961,7 @@ EXPORT_SYMBOL(sock_no_poll);
 EXPORT_SYMBOL(sock_no_recvmsg);
 EXPORT_SYMBOL(sock_no_sendmsg);
 EXPORT_SYMBOL(sock_no_sendpage);
+EXPORT_SYMBOL(sock_def_wakeup);
 EXPORT_SYMBOL(sock_no_setsockopt);
 EXPORT_SYMBOL(sock_no_shutdown);
 EXPORT_SYMBOL(sock_no_socketpair);



[RFC 3/6] Add in TCP related part of splice read to ipv4

2006-09-20 Thread Ashwini Kulkarni

---

 net/ipv4/af_inet.c |1 
 net/ipv4/tcp.c |  135 
 2 files changed, 136 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index c84a320..3c0d245 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -807,6 +807,7 @@ const struct proto_ops inet_stream_ops =
.recvmsg   = sock_common_recvmsg,
.mmap  = sock_no_mmap,
.sendpage  = tcp_sendpage,
+   .splice_read   = tcp_splice_read,
 #ifdef CONFIG_COMPAT
.compat_setsockopt = compat_sock_common_setsockopt,
.compat_getsockopt = compat_sock_common_getsockopt,
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 934396b..d4c02a1 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -254,6 +254,10 @@
 #include <linux/init.h>
 #include <linux/smp_lock.h>
 #include <linux/fs.h>
+#include <linux/skbuff.h>
+#include <linux/pipe_fs_i.h>
+#include <linux/net.h>
+#include <linux/socket.h>
 #include <linux/random.h>
 #include <linux/bootmem.h>
 #include <linux/cache.h>
@@ -264,6 +268,7 @@
 #include <net/xfrm.h>
 #include <net/ip.h>
 #include <net/netdma.h>
+#include <net/sock.h>
 
 #include <asm/uaccess.h>
 #include <asm/ioctls.h>
@@ -291,6 +296,23 @@ EXPORT_SYMBOL(tcp_memory_allocated);
 EXPORT_SYMBOL(tcp_sockets_allocated);
 
 /*
+ * Create a TCP splice context.
+ */
+struct tcp_splice_state {
+   struct pipe_inode_info *pipe;
+   void (*original_data_ready)(struct sock*, int);
+   size_t len;
+   size_t offset;
+   unsigned int flags;
+};
+
+int __tcp_splice_read(struct sock *sk, loff_t *ppos, struct pipe_inode_info *pipe,
+ size_t len, unsigned int flags, struct tcp_splice_state *tss);
+int tcp_splice_data_recv(read_descriptor_t *rd_desc, struct sk_buff *skb,
+unsigned int offset, size_t len);
+void tcp_splice_data_ready(struct sock *sk, int flag);
+
+/*
  * Pressure flag: try to collapse.
  * Technical note: it is used by multiple contexts non atomically.
  * All the sk_stream_mem_schedule() is of this nature: accounting
@@ -499,6 +521,118 @@ static inline void tcp_push(struct sock 
}
 }
 
+/*
+ *  tcp_splice_read - splice data from TCP socket to a pipe
+ * @sock:  socket to splice from
+ * @pipe:  pipe to splice to
+ * @len:   number of bytes to splice
+ * @flags: splice modifier flags
+ *
+ * Will read pages from given socket and fill them into a pipe.
+ */
+ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos, struct pipe_inode_info *pipe, size_t len, unsigned int flags)
+{
+   struct tcp_splice_state tss = {
+   .pipe = pipe,
+   .len = len,
+   .flags = flags,
+   };
+   struct sock *sk = sock->sk;
+   ssize_t spliced;
+   int ret;
+
+   ret = 0;
+   spliced = 0;
+
+   if (*ppos != 0)
+   return -EINVAL;
+
+   while(tss.len) {
+   ret = __tcp_splice_read(sk, ppos, tss.pipe, tss.len, tss.flags, &tss);
+
+   if(ret < 0)
+   break;
+   else if (!ret) {
+   if (spliced)
+   break;
+   if (flags & SPLICE_F_NONBLOCK) {
+   ret = -EAGAIN;
+   break;
+   }
+   }
+   tss.len -= ret;
+   spliced += ret;
+   }
+   if (spliced)
+   return spliced;
+
+   return ret;
+}
+
+int __tcp_splice_read(struct sock *sk, loff_t *ppos, struct pipe_inode_info *pipe, size_t len, unsigned int flags, struct tcp_splice_state *tss)
+{
+   read_descriptor_t rd_desc;
+   int copied;
+
+   tss->original_data_ready = sk->sk_data_ready;
+
+   sk->sk_user_data = tss;
+
+   /* Store TCP splice context information in read_descriptor_t. */
+   rd_desc.arg.data = tss;
+
+   copied = tcp_read_sock(sk, &rd_desc, tcp_splice_data_recv);
+
+   if (copied != 0) {
+   if (flags & SPLICE_F_MORE) {
+   /* Setup new sk_data_ready as tcp_splice_data_ready. */
+   sk->sk_data_ready = tcp_splice_data_ready;
+   return sk_wait_data(sk, &sk->sk_rcvtimeo);
+   }
+   else if(flags & SPLICE_F_NONBLOCK)
+   return -EAGAIN;
+   else return copied;
+   }
+   else
+   return copied;
+}
+
+int tcp_splice_data_recv(read_descriptor_t *rd_desc, struct sk_buff *skb, unsigned int offset, size_t len)
+{
+   /*
+* Restore TCP splice context from read_descriptor_t
+*/
+   struct tcp_splice_state *tss = rd_desc->arg.data;
+
+   return skb_splice_bits(skb, offset, tss->pipe, tss->len, tss->flags);
+}
+
+void tcp_splice_data_ready(struct sock *sk, int flag)
+{
+   /*
+* Restore splice context/ read_descriptor_t from 

[RFC 4/6] Add TCP socket splicing (tcp_splice_read) support

2006-09-20 Thread Ashwini Kulkarni

---

 fs/splice.c   |   16 
 include/linux/net.h   |2 ++
 include/linux/pipe_fs_i.h |1 +
 include/net/tcp.h |3 +++
 net/socket.c  |   13 +
 5 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index c6a880b..3a4202d 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -123,6 +123,12 @@ error:
return err;
 }
 
+void generic_sock_buf_release(struct pipe_inode_info *pipe,
+   struct pipe_buffer *buf)
+{
+   put_page(buf->page);
+}
+
 static struct pipe_buf_operations page_cache_pipe_buf_ops = {
.can_merge = 0,
.map = generic_pipe_buf_map,
@@ -133,6 +139,16 @@ static struct pipe_buf_operations page_c
.get = generic_pipe_buf_get,
 };
 
+static struct pipe_buf_operations sock_buf_ops = {
+   .can_merge = 0,
+   .map = generic_pipe_buf_map,
+   .unmap = generic_pipe_buf_unmap,
+   .pin = generic_pipe_buf_pin,
+   .release = generic_sock_buf_release,
+   .steal = generic_pipe_buf_steal,
+   .get = generic_pipe_buf_get,
+};
+
 static int user_page_pipe_buf_steal(struct pipe_inode_info *pipe,
struct pipe_buffer *buf)
 {
diff --git a/include/linux/net.h b/include/linux/net.h
index b20c53c..65dfe0c 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -164,6 +164,8 @@ struct proto_ops {
  struct vm_area_struct * vma);
ssize_t (*sendpage)  (struct socket *sock, struct page *page,
  int offset, size_t size, int flags);
+   ssize_t (*splice_read)(struct socket *sock,  loff_t *ppos,
+  struct pipe_inode_info *pipe, size_t len, unsigned int flags);
 };
 
 struct net_proto_family {
diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h
index 9067985..f7f439b 100644
--- a/include/linux/pipe_fs_i.h
+++ b/include/linux/pipe_fs_i.h
@@ -72,6 +72,7 @@ void generic_pipe_buf_get(struct pipe_in
 int generic_pipe_buf_pin(struct pipe_inode_info *, struct pipe_buffer *);
 int generic_pipe_buf_steal(struct pipe_inode_info *, struct pipe_buffer *);
 
+void generic_sock_buf_release(struct pipe_inode_info *, struct pipe_buffer *);
 /*
  * splice is tied to pipes as a transport (at least for now), so we'll just
  * add the splice flags here.
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 7a093d0..5032501 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -300,6 +300,9 @@ extern void tcp_cleanup_rbuf(struct so
 extern int tcp_twsk_unique(struct sock *sk,
struct sock *sktw, void *twp);
 
+extern ssize_t tcp_splice_read(struct socket *sk, loff_t *ppos,
+   struct pipe_inode_info *pipe, size_t len, unsigned int flags);
+
 static inline void tcp_dec_quickack_mode(struct sock *sk,
 const unsigned int pkts)
 {
diff --git a/net/socket.c b/net/socket.c
index 6d261bf..8a4f602 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -117,6 +117,8 @@ static ssize_t sock_writev(struct file *
  unsigned long count, loff_t *ppos);
 static ssize_t sock_sendpage(struct file *file, struct page *page,
 int offset, size_t size, loff_t *ppos, int more);
+static ssize_t sock_splice_read(struct file *file, loff_t *ppos,
+   struct pipe_inode_info *pipe, size_t len, unsigned int flags);
 
 /*
  * Socket files have a set of 'special' operations as well as the generic file ones. These don't appear
@@ -141,6 +143,7 @@ static struct file_operations socket_fil
.writev =   sock_writev,
.sendpage = sock_sendpage,
.splice_write = generic_splice_sendpage,
+   .splice_read =  sock_splice_read,
 };
 
 /*
@@ -701,6 +704,16 @@ static ssize_t sock_sendpage(struct file
return sock->ops->sendpage(sock, page, offset, size, flags);
 }
 
+static ssize_t sock_splice_read(struct file *file, loff_t *ppos,
+   struct pipe_inode_info *pipe, size_t len, unsigned int flags)
+{
+   struct socket *sock;
+
+   sock = file->private_data;
+
+   return sock->ops->splice_read(sock, ppos, pipe, len, flags);
+}
+
 static struct sock_iocb *alloc_sock_iocb(struct kiocb *iocb,
char __user *ubuf, size_t size, struct sock_iocb *siocb)
 {



[RFC 0/6] TCP socket splice

2006-09-20 Thread Ashwini Kulkarni

My name is Ashwini Kulkarni and I have been working at Intel Corporation for
the past 4 months as an engineering intern. I have been working on the 'TCP
socket splice' project with Chris Leech. This is a work-in-progress version
of the project with scope for further modifications.

TCP socket splicing:
It allows a TCP socket to be spliced to a file via a pipe buffer. First, to
splice data from a socket to a pipe buffer, up to 16 source pages are pulled
into the pipe buffer. Then, to splice data from the pipe buffer to a file,
those pages are migrated into the address space of the target file. All of
this takes place within the kernel and thus results in zero memory copies. It
is the receive-side complement to sendfile(), but unlike sendfile() it makes
it possible to splice from a socket as well, not just to a socket.

Current Method:
 +   Application Buffer +
 |   |
_|___|_
 |   |
  Receive or |   | Write
  I/OAT DMA  |   |
 |   |
 |   V
   Network  File System
   Buffer  Buffer
 ^   |
 |   |
_|___|_
 DMA |   | DMA
 |   |
   Hardware  |   |
 |   V
NIC SATA

In the current method, the packet is DMA'd from the NIC into the network
buffer. The application then reads from the socket, and the packet data is
copied from the network buffer to the application buffer in user space. A
write operation then copies the data from the application buffer to the file
system buffer, which is then DMA'd to the disk. Thus, in the current method
there is one full copy of all the data to user space.
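
As a rough user-space illustration of the path described above (a sketch under
assumptions, not code from this patch set), the loop below moves data from one
descriptor to another through an application buffer. The function name
copy_through_user_buffer and the 4096-byte buffer size are made up for this
example.

```c
#include <errno.h>
#include <unistd.h>

/* Copy up to len bytes from in_fd to out_fd through a user-space buffer.
 * Returns the number of bytes copied, or -1 on error (errno is set).
 * Hypothetical helper, for illustration only.
 */
ssize_t copy_through_user_buffer(int in_fd, int out_fd, size_t len)
{
	char buf[4096];		/* the "application buffer" in the diagram */
	size_t done = 0;

	while (done < len) {
		size_t want = len - done;
		ssize_t n, off = 0;

		if (want > sizeof(buf))
			want = sizeof(buf);

		n = read(in_fd, buf, want);	/* kernel -> user copy */
		if (n == 0)
			break;			/* end of stream */
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}

		while (off < n) {		/* user -> kernel copy */
			ssize_t w = write(out_fd, buf + off, (size_t)(n - off));

			if (w < 0) {
				if (errno == EINTR)
					continue;
				return -1;
			}
			off += w;
		}
		done += (size_t)n;
	}
	return (ssize_t)done;
}
```

Every byte crosses the user/kernel boundary twice here, which is the cost the
splice approach tries to avoid.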

Using TCP socket splice:

Application Control
 |
_|__
 |
 |   TCP socket splice
 | +-+
 | | Direct path |
 V | V
   Network  File System
   Buffer  Buffer
 ^   |
 |   |
_|___|__
 DMA |   | DMA
 |   |
   Hardware  |   |
 |   V
NIC SATA

In this method, the objective is to use TCP socket splicing to create a direct
path in the kernel from the network buffer to the file system buffer via a pipe
buffer. The pages migrate from the network buffer (which is associated
with the socket) into the pipe buffer for an optimized path. From the pipe
buffer, the pages are then migrated into the page cache of the output file's
address space. This makes it possible to create a LAN-to-file-system API that
avoids the memcpy operations through user space and thus creates a fast path
from the network buffer to the storage buffer.
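
From user space, the two-stage path described above could look roughly like
the sketch below. It uses the splice(2) system call as found on current Linux
systems; whether the source descriptor supports splicing depends on the kernel
(socket support is exactly what this RFC proposes). The helper name
splice_fd_to_fd is an assumption made for this example.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/* Move up to len bytes from in_fd to out_fd through an intermediate pipe,
 * without copying the data through a user-space buffer.
 * Returns the number of bytes moved, or -1 on error.
 * Hypothetical helper, for illustration only.
 */
ssize_t splice_fd_to_fd(int in_fd, int out_fd, size_t len)
{
	int p[2];
	size_t done = 0;
	int failed = 0;

	if (pipe(p) < 0)
		return -1;

	while (done < len && !failed) {
		/* Stage 1: pull data from the source into the pipe. */
		ssize_t n = splice(in_fd, NULL, p[1], NULL, len - done,
				   SPLICE_F_MOVE);
		ssize_t left;

		if (n == 0)
			break;		/* source reached end of stream */
		if (n < 0) {
			failed = 1;
			break;
		}

		/* Stage 2: drain the pipe into the destination. */
		for (left = n; left > 0; ) {
			ssize_t w = splice(p[0], NULL, out_fd, NULL,
					   (size_t)left, SPLICE_F_MOVE);

			if (w <= 0) {
				failed = 1;
				break;
			}
			left -= w;
		}
		done += (size_t)n;
	}

	close(p[0]);
	close(p[1]);
	return failed ? -1 : (ssize_t)done;
}
```

The application only issues control calls; the payload itself stays in the
kernel, which matches the "Application Control" box in the diagram.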

Open Issues (currently being addressed):
There is a performance drop when transferring bigger files (usually larger than
65536 bytes in size). Performance drop increases with the size of the file.
Work is in progress to identify the source of this issue.

We encourage the community to review our TCP socket splice project. Feedback
would be greatly appreciated.

--
Ashwini Kulkarni


[RFC 5/6] Add skb_splice_bits to skbuff.c

2006-09-20 Thread Ashwini Kulkarni

---

 include/linux/skbuff.h |2 +
 net/core/skbuff.c  |  137 
 2 files changed, 139 insertions(+), 0 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 755e9cd..8f4b90e 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1338,6 +1338,8 @@ extern unsigned intskb_checksum(cons
int len, unsigned int csum);
 extern int skb_copy_bits(const struct sk_buff *skb, int offset,
 void *to, int len);
+extern int skb_splice_bits(const struct sk_buff *skb, int offset,
+struct pipe_inode_info *pipe, int len, unsigned int flags);
 extern int skb_store_bits(const struct sk_buff *skb, int offset,
  void *from, int len);
 extern unsigned int skb_copy_and_csum_bits(const struct sk_buff *skb,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index c54f366..a92d165 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -53,6 +53,7 @@
 #endif
 #include <linux/string.h>
 #include <linux/skbuff.h>
+#include <linux/pipe_fs_i.h>
 #include <linux/cache.h>
 #include <linux/rtnetlink.h>
 #include <linux/init.h>
@@ -70,6 +71,17 @@
 static kmem_cache_t *skbuff_head_cache __read_mostly;
 static kmem_cache_t *skbuff_fclone_cache __read_mostly;
 
+/* Pipe buffer operations for a socket. */
+static struct pipe_buf_operations sock_buf_ops = {
+   .can_merge = 0,
+   .map = generic_pipe_buf_map,
+   .unmap = generic_pipe_buf_unmap,
+   .pin = generic_pipe_buf_pin,
+   .release = generic_sock_buf_release,
+   .steal = generic_pipe_buf_steal,
+   .get = generic_pipe_buf_get,
+};
+
 /*
  * Keep out-of-line to prevent kernel bloat.
  * __builtin_return_address is not used because it is not always
@@ -1148,6 +1160,131 @@ fault:
return -EFAULT;
 }
 
+/* Move specified number of bytes from the source skb to the
+ * destination pipe buffer. This function even handles all the
+ * bits of traversing fragment lists.
+ */
+int skb_splice_bits(const struct sk_buff *skb, int offset, struct pipe_inode_info *pipe, int len, unsigned int flags)
+{
+	struct page *page;
+	struct partial_page partial[PIPE_BUFFERS];
+	struct page *pages[PIPE_BUFFERS];
+	int buflen, available_len;
+	int pg_nr = 0;
+	int i, nfrags;
+	void *address;
+	size_t ret = 0;
+	struct splice_pipe_desc spd = {
+		.pages = pages,
+		.partial = partial,
+		.flags = flags,
+		.ops = &sock_buf_ops,
+	};
+
+	buflen = skb_headlen(skb);
+
+	if ((available_len = buflen - offset) > 0) {
+		if (available_len > len)
+			available_len = len;
+
+		page = alloc_page(GFP_KERNEL);
+		if (!page)
+			return -ENOMEM;
+
+		address = kmap(page);
+		memcpy(address, skb->data + offset, available_len);
+		/* Push page into splice pipe desc. */
+		spd.pages[pg_nr] = page;
+		pg_nr++;
+		kunmap(page);
+
+		/* If entire length has been consumed or number of pages pushed into
+		 * splice pipe desc(pipe buffer) equals 16, then call splice_to_pipe.
+		 */
+		if (((len -= available_len) == 0) || pg_nr == PIPE_BUFFERS) {
+			spd.nr_pages = pg_nr;
+			offset += available_len;
+			ret = splice_to_pipe(pipe, &spd);
+			if (ret == -EPIPE)
+				return -EPIPE;
+			else if (ret == -EAGAIN)
+				return -EAGAIN;
+			else if (ret == -ERESTARTSYS)
+				return -ERESTARTSYS;
+			else goto frags;
+		}
+	}
+	frags:
+	if (skb_shinfo(skb)->nr_frags != 0) {
+		nfrags = skb_shinfo(skb)->nr_frags;
+
+		for (i = 0; i < nfrags; i++) {
+			int total;
+			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+			get_page(skb_shinfo(skb)->frags[i].page);
+
+			total = buflen + skb_shinfo(skb)->frags[i].size;
+
+   if 

[RFC 6/6] Move i_size_read part from do_splice_to() to __generic_file_splice_read() in splice.c

2006-09-20 Thread Ashwini Kulkarni

---

 fs/splice.c |   18 --
 1 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index 3a4202d..2f8f42a 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -271,7 +271,7 @@ __generic_file_splice_read(struct file *
struct partial_page partial[PIPE_BUFFERS];
struct page *page;
pgoff_t index, end_index;
-   loff_t isize;
+   loff_t isize, left;
size_t total_len;
int error, page_nr;
struct splice_pipe_desc spd = {
@@ -421,6 +421,13 @@ __generic_file_splice_read(struct file *
 * i_size must be checked after ->readpage().
 */
isize = i_size_read(mapping->host);
+   if (unlikely(*ppos >= isize))
+           return 0;
+
+   left = isize - *ppos;
+   if (unlikely(left < len))
+           len = left;
+
end_index = (isize - 1) >> PAGE_CACHE_SHIFT;
if (unlikely(!isize || index > end_index))
break;
@@ -903,7 +910,6 @@ static long do_splice_to(struct file *in
 struct pipe_inode_info *pipe, size_t len,
 unsigned int flags)
 {
-   loff_t isize, left;
int ret;
 
if (unlikely(!in->f_op || !in->f_op->splice_read))
@@ -916,14 +922,6 @@ static long do_splice_to(struct file *in
if (unlikely(ret < 0))
return ret;
 
-   isize = i_size_read(in->f_mapping->host);
-   if (unlikely(*ppos >= isize))
-           return 0;
-   
-   left = isize - *ppos;
-   if (unlikely(left < len))
-           len = left;
-
return in->f_op->splice_read(in, ppos, pipe, len, flags);
 }
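
The check being moved by this patch clamps the requested length to what is
left before end of file. A minimal user-space sketch of that arithmetic
follows; the function name clamp_read_len and the use of int64_t in place of
loff_t are assumptions for this example, not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Return how many bytes may be read starting at pos from a file of size
 * isize, given a request for len bytes. Mirrors the isize/left logic in
 * the hunk above. Hypothetical helper, for illustration only.
 */
size_t clamp_read_len(int64_t pos, int64_t isize, size_t len)
{
	int64_t left;

	if (pos >= isize)
		return 0;		/* at or past end of file */

	left = isize - pos;		/* strictly positive here */
	if ((uint64_t)left < len)
		len = (size_t)left;	/* do not read past end of file */

	return len;
}
```

A socket has no i_size, which is presumably why the check has to live in the
file-specific read path rather than in the generic do_splice_to().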
 



RE: [PATCH 08/23] e1000: add multicast stats counters

2006-09-20 Thread cramerj
 Williams, Mitch A wrote:
  + { "rx_broadcast", E1000_STAT(stats.bprc) },
  + { "tx_broadcast", E1000_STAT(stats.bptc) },
  + { "rx_multicast", E1000_STAT(stats.mprc) },
  + { "tx_multicast", E1000_STAT(stats.mptc) },
{ "rx_errors", E1000_STAT(net_stats.rx_errors) },
{ "tx_errors", E1000_STAT(net_stats.tx_errors) },
{ "tx_dropped", E1000_STAT(net_stats.tx_dropped) },
  NAK -- you also need to remove the standard net stats, which are
  exported elsewhere
 
  Jeff, can you please explain the reason for this NAK a little more?
  Neither Auke nor I understand why you rejected the patch.
 
  This patch just adds the display of a few more stats in Ethtool.  It
  doesn't affect any other counters, and is really just a convenience
  feature.  I added this to the driver because of a customer request.
 
 Adding those stats is fine.  You guys just need to remove the existing
 mess first.
 
   Jeff
 

Since we have 1-to-1 mapping of some of our statistics registers to the
net_stats, we could s/net_stats/stats/.  However, there are a few
net_stats (e.g. net_stats.rx_errors) that encapsulate more than one
e1000 statistic register of which we don't have a private stat member
defined.

For those statistics, is it really necessary to add another stat
structure just to rm net_stats from that list we pass to ethtool?  At
best, it would look something like this...

  { "foo_count", E1000_STAT(stats.foo) },
- { "rx_errors", E1000_STAT(net_stats.rx_errors) },
+ { "rx_errors", E1000_STAT(eth_stats.rx_errors) },
  { "bar_count", E1000_STAT(stats.bar) },

If so, well, OK.  I'm just scratching my head as to why it's a mess
as-is.
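
For readers unfamiliar with the table being discussed, the sketch below shows
the general shape of such a name/size/offset table and a lookup over it. It is
a stand-alone user-space model: the nic_* names, the NIC_STAT macro and the
choice of 64-bit counters are assumptions for this example and are not taken
from the e1000 driver.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical NIC-private hardware counters. */
struct nic_hw_stats {
	uint64_t bprc, bptc, mprc, mptc;
};

struct nic_adapter {
	struct nic_hw_stats stats;
};

/* One table entry: exported name, field size, field offset. */
struct nic_stat_desc {
	const char *name;
	size_t size;
	size_t offset;
};

/* Expands to the size and offset of member m inside struct nic_adapter. */
#define NIC_STAT(m) \
	sizeof(((struct nic_adapter *)0)->m), offsetof(struct nic_adapter, m)

static const struct nic_stat_desc nic_stat_table[] = {
	{ "rx_broadcast", NIC_STAT(stats.bprc) },
	{ "tx_broadcast", NIC_STAT(stats.bptc) },
	{ "rx_multicast", NIC_STAT(stats.mprc) },
	{ "tx_multicast", NIC_STAT(stats.mptc) },
};

/* Look up a counter by name. Returns 0 and fills *out, or -1 if unknown. */
int nic_get_stat(const struct nic_adapter *a, const char *name, uint64_t *out)
{
	size_t i;

	for (i = 0; i < sizeof(nic_stat_table) / sizeof(nic_stat_table[0]); i++) {
		const struct nic_stat_desc *d = &nic_stat_table[i];

		if (strcmp(d->name, name) != 0)
			continue;
		if (d->size != sizeof(uint64_t))
			return -1;
		memcpy(out, (const char *)a + d->offset, sizeof(uint64_t));
		return 0;
	}
	return -1;
}
```

Because each entry only records an offset into the adapter structure, moving a
counter from one embedded struct to another is a one-line change in the table.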

I've missed obvious alternatives before; care to enlighten?

-Jeb


Re: [PATCH 08/23] e1000: add multicast stats counters

2006-09-20 Thread Jeff Garzik

cramerj wrote:

Williams, Mitch A wrote:

+   { "rx_broadcast", E1000_STAT(stats.bprc) },
+   { "tx_broadcast", E1000_STAT(stats.bptc) },
+   { "rx_multicast", E1000_STAT(stats.mprc) },
+   { "tx_multicast", E1000_STAT(stats.mptc) },
{ "rx_errors", E1000_STAT(net_stats.rx_errors) },
{ "tx_errors", E1000_STAT(net_stats.tx_errors) },
{ "tx_dropped", E1000_STAT(net_stats.tx_dropped) },

NAK -- you also need to remove the standard net stats, which are
exported elsewhere

Jeff, can you please explain the reason for this NAK a little more?
Neither Auke nor I understand why you rejected the patch.



This patch just adds the display of a few more stats in Ethtool.  It
doesn't affect any other counters, and is really just a convenience
feature.  I added this to the driver because of a customer request.

Adding those stats is fine.  You guys just need to remove the existing
mess first.



Since we have 1-to-1 mapping of some of our statistics registers to the
net_stats, we could s/net_stats/stats/.  However, there are a few
net_stats (e.g. net_stats.rx_errors) that encapsulate more than one
e1000 statistic register of which we don't have a private stat member
defined.

For those statistics, is it really necessary to add another stat
structure just to rm net_stats from that list we pass to ethtool?  At
best, it would look something like this...

  { "foo_count", E1000_STAT(stats.foo) },
- { "rx_errors", E1000_STAT(net_stats.rx_errors) },
+ { "rx_errors", E1000_STAT(eth_stats.rx_errors) },
  { "bar_count", E1000_STAT(stats.bar) },

If so, well, OK.  I'm just scratching my head as to why it's a mess
as-is.


The ethtool get-stats sub ioctl has _always_ been for exporting _only_ 
NIC-private statistics.


So, no, there is no inherent connection between adding multicast stats 
and removing ones that should have never been in the list.  But if I 
don't put my foot down, this will never get corrected.


Jeff




RE: [PATCH 4/7] secid reconciliation-v02: Invoke LSM hook for out bound traffic

2006-09-20 Thread James Morris
On Wed, 20 Sep 2006, Venkat Yekkirala wrote:

  Quite a lot of logic has changed here.
  
  With the original code, we only restored a secmark once for 
  the lifetime 
  of a packet or connection (to make behavior deterministic and 
  security 
  marks immutable in the face of arbitrarily complex iptables rules).
  
  With your patch, secmarks are always writable.
 
 Hopefully the following thread addressed these concerns.
 http://marc.theaimsgroup.com/?l=selinux&m=115870100405571&w=2

Ok, but can we preserve existing behavior when packets are only being 
labeled internally?

(We should probably settle on the use of 'external' for cipso/xfrm 
labeling and 'internal' for iptables only).


  Also, we did not restore a 'null' (zero) secmark to the skb 
  (while this 
  should never happen with the current SECMARK target, there may be 
  non-SELinux extensions later which set a null marking).
 
 How do you envision this (i.e. restoring a null secmark) being useful?
 secmark is anyway zero by default (when no labeling rules exist for the
 connection) right?

Actually, don't worry about this.  The implementation can decide what a 
'null' mark might be and manage it themselves.


  You've also changed the logic for the dummy case of 
  security_skb_netfilter_check()
 
 I am not getting this. This is a new function. Did you mean
 to point to a different function?
 
  
  
  +static inline int security_skb_netfilter_check(struct sk_buff *skb,
  +   u32 nf_secid)
  +{
  +   return 1;
  +}
  +
  
  This code does not now behave as it did originally.  Keep in 
  mind that 
  SELinux is not the only user of SECMARK.

I'm talking about the code as a whole and the way this hook does not 
preserve existing behavior in the default case.

Look at the original code:

static void secmark_restore(struct sk_buff *skb)
{
	if (!skb->secmark) {
		u32 *connsecmark;
		enum ip_conntrack_info ctinfo;

		connsecmark = nf_ct_get_secmark(skb, &ctinfo);
		if (connsecmark && *connsecmark)
			if (skb->secmark != *connsecmark)
				skb->secmark = *connsecmark;
	}
}

Now, you have added an LSM hook in here:

+   /* Set secmark on inbound and filter it on outbound */
+   if (hooknum == NF_IP_POST_ROUTING || hooknum == NF_IP6_POST_ROUTING) {
+           if (!security_skb_netfilter_check(skb, secmark))
+                   return NF_DROP;
+   } else
+           if (skb->secmark != secmark)
+                   skb->secmark = secmark;

The dummy hook does not restore the secmark in the way that the original 
code does, now depending on the hooknum.

When LSM is not configured or no LSM module is active, the behavior of the 
code must be identical to the original version.
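
The invariant being asked for can be modelled in a few lines of user-space C.
This is only a sketch of the "restore once, never overwrite" rule from the
original secmark_restore(); the struct and function names below are made up
for illustration and are not kernel types.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-ins for a packet and its tracked connection. */
struct model_pkt {
	uint32_t secmark;
};

struct model_conn {
	uint32_t secmark;
};

/* Copy the connection's mark to the packet only if the packet is still
 * unmarked and the connection carries a non-zero mark. An already marked
 * packet is left alone, whatever hook this runs from.
 */
void model_secmark_restore(struct model_pkt *pkt, const struct model_conn *conn)
{
	if (pkt->secmark != 0)
		return;
	if (conn != NULL && conn->secmark != 0)
		pkt->secmark = conn->secmark;
}
```

Any replacement for the original function would need to keep these three
outcomes unchanged when no LSM module is active.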

  I really don't know if connection tracking is the right place 
  to be doing 
  policy enforcement, either.  Perhaps you should just do the 
  relabeling here 
  and enforcement later.
 
 We could have done enforcement, in the SELinux postroute_last
 hook for example, if only there were a place to hold onto the
 exit point context, separate from the label already associated
 with the skb in the secmark field. postroute_last would need BOTH
 the label of the skb (available in the secmark field) and the
 exit point context to do enforcement.

Ok, it's not pretty, but I guess it's much better than adding another 
field to the skb or similar.


- James
-- 
James Morris
[EMAIL PROTECTED]


RE: [redhat-lspp] ipsec acquire has security context although I a m not using it.

2006-09-20 Thread Joy Latten
Venkat,

This doesn't look right since kzalloc would already have zeroed the
structure out. Are you sure you are getting garbage in the acquire
from the kernel? If you are, I strongly doubt that this would be the
one causing it (unless kzalloc on this arch misbehaved).
Or is this a racoon bug?

Yes, you are correct! Thanks for pointing this out to me
as I missed it! It is racoon that has the bug.
Will fix and post correct fix shortly. Please ignore
attached fix as it is incorrect.

Again, thanks!

Regards,
Joy

 When using ipsec while selinux is enabled in my kernel, 
 my racoon daemon fails to establish an SA. I believe the
 ACQUIRE sent from kernel has a security context although I 
 am not using this feature with ipsec. As a result, racoon
 fails to establish the SA, because it is looking for a policy
 with security context. I noticed the security context 
 contains garbage. 
 
 I am using a pseries, power5, ppc64 box, and it appears
 that since the policy->security structure is not really initialized
 or zero'd out when not using, it is possible it may contain garbage
 on my pseries and a call such as if (policy->security) may 
 come back as true such that security context is included in
 my acquire message although I believe it should not be. 
 
 Hopefully, the below patch is acceptable. I have compiled and
 tested it.
 
 Regards,
 Joy Latten
 
 
 diff -urpN linux-2.6.17.orig/net/xfrm/xfrm_policy.c 
 linux-2.6.17.patch/net/xfrm/xfrm_policy.c
 --- linux-2.6.17.orig/net/xfrm/xfrm_policy.c 2006-09-19 
 02:11:33.0 -0500
 +++ linux-2.6.17.patch/net/xfrm/xfrm_policy.c2006-09-19 
 04:33:50.0 -0500
 @@ -319,6 +319,7 @@ struct xfrm_policy *xfrm_policy_alloc(gf
  init_timer(&policy->timer);
  policy->timer.data = (unsigned long)policy;
  policy->timer.function = xfrm_policy_timer;
 +policy->security = NULL;
  }
  return policy;
  }
 

--
This message was distributed to subscribers of the selinux mailing list.
If you no longer wish to subscribe, send mail to [EMAIL PROTECTED] with
the words unsubscribe selinux without quotes as the message.



RE: UDP Out 0f Sequence

2006-09-20 Thread Majumder, Rajib
Does this mean if we have 2 hosts connected back to back (there's no network 
device in between), sequence is guaranteed even in UDP? 

-Original Message-
From: Rick Jones [mailto:[EMAIL PROTECTED]
Sent: 21 September 2006 00:47
To: Majumder, Rajib
Cc: 'netdev@vger.kernel.org'
Subject: Re: UDP Out 0f Sequence


Majumder, Rajib wrote:
 Hi,
 
 If I write UDP datagrams 1,2 and 3 to network and if the receiver
 receives in order 2,1, and 3, where can the sequence get changed? Is it
 at the source stack, network transit or destination stack?

Yes. :)

Although network transit is by far the most likely case.  Destination 
stack is a distant second and source stack an even more distant third. 
Generally stack writers try to avoid having places in their stacks where 
things can reorder, but it isn't completely unknown.

rick jones


==
Please access the attached hyperlink for an important electronic communications 
disclaimer: 

http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
==



Re: UDP Out 0f Sequence

2006-09-20 Thread David Miller
From: Majumder, Rajib [EMAIL PROTECTED]
Date: Thu, 21 Sep 2006 10:50:17 +0800

 Does this mean if we have 2 hosts connected back to back (there's no
 network device in between), sequence is guaranteed even in UDP?

Not true.  Even for back to back systems SMP can cause packets
to be delivered out of order even locally within the system
on receive.


Re: UDP Out 0f Sequence

2006-09-20 Thread Ian McDonald

On 9/21/06, Majumder, Rajib [EMAIL PROTECTED] wrote:

Does this mean if we have 2 hosts connected back to back (there's no network 
device in between), sequence is guaranteed even in UDP?


I think if you're trying to make the packets appear in order you need
to untie the Gordian knot http://en.wikipedia.org/wiki/Gordian_Knot

In other words, you should fix the application rather than attempt the
near-impossible task of forcing the packets to arrive in order...
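
One common way to "fix the application" is to put a sequence number in each
datagram and let the receiver put them back in order. The sketch below is a
minimal in-memory reorder window, not anything posted in this thread; the
names, the window size of 8 and the int payload are assumptions chosen to keep
the example short.

```c
#include <stdint.h>
#include <string.h>

#define REORDER_WINDOW 8u	/* must be a power of two */

struct reorder_buf {
	uint32_t next_seq;			/* next sequence to deliver */
	int present[REORDER_WINDOW];
	int payload[REORDER_WINDOW];
};

void reorder_init(struct reorder_buf *rb, uint32_t first_seq)
{
	memset(rb, 0, sizeof(*rb));
	rb->next_seq = first_seq;
}

/* Store a received datagram. Returns 0 if stored, -1 if it is a duplicate,
 * already delivered, or too far ahead of the window.
 */
int reorder_push(struct reorder_buf *rb, uint32_t seq, int payload)
{
	uint32_t ahead = seq - rb->next_seq;	/* unsigned: wraps safely */
	uint32_t slot = seq % REORDER_WINDOW;

	if (ahead >= REORDER_WINDOW)
		return -1;
	if (rb->present[slot])
		return -1;

	rb->present[slot] = 1;
	rb->payload[slot] = payload;
	return 0;
}

/* Deliver the next in-order datagram if it has arrived.
 * Returns 1 and fills *payload, or 0 if the next one is still missing.
 */
int reorder_pop(struct reorder_buf *rb, int *payload)
{
	uint32_t slot = rb->next_seq % REORDER_WINDOW;

	if (!rb->present[slot])
		return 0;

	*payload = rb->payload[slot];
	rb->present[slot] = 0;
	rb->next_seq++;
	return 1;
}
```

A real receiver would also need a policy for datagrams that never arrive (a
timeout that skips the gap, or a retransmission request), which this sketch
leaves out.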

Ian
--
Ian McDonald
Web: http://wand.net.nz/~iam4
Blog: http://imcdnzl.blogspot.com
WAND Network Research Group
Department of Computer Science
University of Waikato
New Zealand


[patch 0/3 v2] Add tsi108 On chip Ethernet device driver support

2006-09-20 Thread Zang Roy-r61911
The Tundra Semiconductor Corporation (Tundra) Tsi108/9 is a host bridge
for PowerPC processors that offers numerous system interconnect options
for embedded application designers. The Tsi108/9 can interconnect 60x
or MPX processors to PCI/X peripherals, DDR2-400 memory, Gigabit
Ethernet, and Flash.

Tsi108/109 is used on the powerpc/mpc7448hpc2 platform.

The following patch series provides Tsi108/9 on-chip Ethernet support.

1/3 : Config and Makefile modification.
2/3 : Header file
3/3 : C body file

This series addresses the feedback on the previous patches.
Feedback is welcome.
Roy 



[patch 3/3 v2] Add tsi108 On Chip Ethernet device driver support

2006-09-20 Thread Zang Roy-r61911
The Tundra Semiconductor Corporation (Tundra) Tsi108/9 is a host bridge
for PowerPC processors that offers numerous system interconnect options
for embedded application designers. The Tsi108/9 can interconnect 60x
or MPX processors to PCI/X peripherals, DDR2-400 memory, Gigabit
Ethernet, and Flash.
Tsi108/109 is used on the powerpc/mpc7448hpc2 platform.

The following patch provides the Tsi108/9 on-chip Ethernet driver.

Signed-off-by: Alexandre Bounine [EMAIL PROTECTED]
Signed-off-by: Roy Zang [EMAIL PROTECTED] 


--
 drivers/net/tsi108_eth.c | 1700 ++
 1 files changed, 1700 insertions(+), 0 deletions(-)

diff --git a/drivers/net/tsi108_eth.c b/drivers/net/tsi108_eth.c
new file mode 100644
index 0000000..5714f78
--- /dev/null
+++ b/drivers/net/tsi108_eth.c
@@ -0,0 +1,1700 @@
+/***
+  
+  Copyright(c) 2006 Tundra Semiconductor Corporation.
+  
+  This program is free software; you can redistribute it and/or modify it 
+  under the terms of the GNU General Public License as published by the Free 
+  Software Foundation; either version 2 of the License, or (at your option) 
+  any later version.
+  
+  This program is distributed in the hope that it will be useful, but WITHOUT 
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for 
+  more details.
+  
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc., 59 
+  Temple Place - Suite 330, Boston, MA  02111-1307, USA.
+
+***/
+
+/* This driver is based on the driver code originally developed
+ * for the Intel IOC80314 (ForestLake) Gigabit Ethernet by
+ * [EMAIL PROTECTED]  * Copyright (C) 2003 TimeSys Corporation
+ *
+ * Currently changes from original version are:
+ * - porting to Tsi108-based platform and kernel 2.6 ([EMAIL PROTECTED])
+ * - modifications to handle two ports independently and support for
+ *   additional PHY devices ([EMAIL PROTECTED])
+ * - Get hardware information from platform device. ([EMAIL PROTECTED])
+ *
+ */
+
+#include <linux/config.h>
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/init.h>
+#include <linux/net.h>
+#include <linux/netdevice.h>
+#include <linux/etherdevice.h>
+#include <linux/skbuff.h>
+#include <linux/slab.h>
+#include <linux/sched.h>
+#include <linux/spinlock.h>
+#include <linux/delay.h>
+#include <linux/crc32.h>
+#include <linux/mii.h>
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/rtnetlink.h>
+#include <linux/timer.h>
+#include <linux/platform_device.h>
+
+#include <asm/system.h>
+#include <asm/io.h>
+#include <asm/tsi108.h>
+
+#include "tsi108_eth.h"
+
+#define MII_READ_DELAY 1   /* max link wait time in msec */
+
+#define TSI108_RXRING_LEN 256
+
+/* NOTE: The driver currently does not support receiving packets
+ * larger than the buffer size, so don't decrease this (unless you
+ * want to add such support).
+ */
+#define TSI108_RXBUF_SIZE 1536
+
+#define TSI108_TXRING_LEN 256
+
+#define TSI108_TX_INT_FREQ 64
+
+/* Check the phy status every half a second. */
+#define CHECK_PHY_INTERVAL (HZ/2)
+
+static int tsi108_init_one(struct platform_device *pdev);
+static int tsi108_ether_remove(struct platform_device *pdev);
+
+struct tsi108_prv_data {
+   void  __iomem *regs;/* Base of normal regs */
+   void  __iomem *phyregs; /* Base of register bank used for PHY access */
+   
+   int phy;/* Index of PHY for this interface */
+   int irq_num;
+   int id;
+
+   struct timer_list timer;/* Timer that triggers the check phy function */
+   int rxtail; /* Next entry in rxring to read */
+   int rxhead; /* Next entry in rxring to give a new buffer */
+   int rxfree; /* Number of free, allocated RX buffers */
+
+   int rxpending;  /* Non-zero if there are still descriptors
+* to be processed from a previous descriptor
+* interrupt condition that has been cleared */
+
+   int txtail; /* Next TX descriptor to check status on */
+   int txhead; /* Next TX descriptor to use */
+
+   /* Number of free TX descriptors.  This could be calculated from
+* rxhead and rxtail if one descriptor were left unused to disambiguate
+* full and empty conditions, but it's simpler to just keep track
+* explicitly. */
+
+   int txfree;
+
+   int phy_ok; /* The PHY is currently powered on. */
+
+   /* PHY status (duplex is 1 for half, 2 for full,
+* so that the default 0 indicates that neither has
+* yet been configured). */
+
+   int link_up;
+   int speed;
+   

[patch 1/3 v2] Add tsi108 On Chip Ethernet device driver support

2006-09-20 Thread Zang Roy-r61911
The Tundra Semiconductor Corporation (Tundra) Tsi108/9 is a host bridge
for PowerPC processors that offers numerous system interconnect options
for embedded application designers. The Tsi108/9 can interconnect 60x
or MPX processors to PCI/X peripherals, DDR2-400 memory, Gigabit
Ethernet, and Flash.
Tsi108/109 is used on the powerpc/mpc7448hpc2 platform.
The following patch provides the Kconfig and Makefile changes for the
Tsi108/9 on-chip Ethernet driver.


Signed-off-by: Alexandre Bounine [EMAIL PROTECTED]
Signed-off-by: Roy Zang [EMAIL PROTECTED] 


--
 drivers/net/Kconfig  |8 
 drivers/net/Makefile |1 +
 2 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index a2bd811..eb17060 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -2221,6 +2221,14 @@ config SPIDER_NET
  This driver supports the Gigabit Ethernet chips present on the
  Cell Processor-Based Blades from IBM.
 
+config TSI108_ETH
+  tristate "Tundra TSI108 gigabit Ethernet support"
+  depends on TSI108_BRIDGE
+  help
+This driver supports Tundra TSI108 gigabit Ethernet ports.
+To compile this driver as a module, choose M here: the module
+will be called tsi108_eth.
+
 config GIANFAR
tristate "Gianfar Ethernet"
depends on 85xx || 83xx || PPC_86xx
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 8427bf9..da199e7 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -112,6 +112,7 @@ obj-$(CONFIG_B44) += b44.o
 obj-$(CONFIG_FORCEDETH) += forcedeth.o
 obj-$(CONFIG_NE_H8300) += ne-h8300.o 8390.o
 
+obj-$(CONFIG_TSI108_ETH) += tsi108_eth.o
 obj-$(CONFIG_MV643XX_ETH) += mv643xx_eth.o
 
 obj-$(CONFIG_PPP) += ppp_generic.o slhc.o
-- 
1.4.0



Re: [patch 3/3] Add tsi108 On Chip Ethernet device driver support

2006-09-20 Thread Jeff Garzik

Zang Roy-r61911 wrote:

+#define TSI108_ETH_WRITE_REG(offset, val) \
+   writel(le32_to_cpu(val),data->regs + (offset))
+
+#define TSI108_ETH_READ_REG(offset) \
+   le32_to_cpu(readl(data->regs + (offset)))
+
+#define TSI108_ETH_WRITE_PHYREG(offset, val) \
+   writel(le32_to_cpu(val), data->phyregs + (offset))
+
+#define TSI108_ETH_READ_PHYREG(offset) \
+   le32_to_cpu(readl(data->phyregs + (offset)))



NAK:

1) writel() and readl() are defined to be little endian.

If your platform is different, then your platform should have its own 
foobus_writel() and foobus_readl().


2) TSI108_ETH_WRITE_REG() is just way too long.  TSI_READ(), 
TSI_WRITE(), TSI_READ_PHY() and TSI_WRITE_PHY() would be far more readable.
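
To make point 1 concrete, here is a small user-space model of an accessor pair
that is little endian by definition, so callers never need to byte-swap. The
model_readl()/model_writel() functions are stand-ins written for this example
and are not the kernel's implementations; the TSI_READ()/TSI_WRITE() wrappers
reuse the shorter names suggested in point 2 but take an explicit base
pointer, which is an assumption made for this sketch.

```c
#include <stdint.h>

/* Store val with its least significant byte at the lowest address,
 * whatever the byte order of the host CPU.
 */
static inline void model_writel(uint32_t val, volatile uint8_t *addr)
{
	addr[0] = (uint8_t)(val & 0xff);
	addr[1] = (uint8_t)((val >> 8) & 0xff);
	addr[2] = (uint8_t)((val >> 16) & 0xff);
	addr[3] = (uint8_t)((val >> 24) & 0xff);
}

/* Load a 32-bit little-endian value and return it in host order. */
static inline uint32_t model_readl(const volatile uint8_t *addr)
{
	return (uint32_t)addr[0] |
	       ((uint32_t)addr[1] << 8) |
	       ((uint32_t)addr[2] << 16) |
	       ((uint32_t)addr[3] << 24);
}

/* Short wrappers; no extra le32_to_cpu() is needed around them. */
#define TSI_WRITE(base, offset, val)	model_writel((val), (base) + (offset))
#define TSI_READ(base, offset)		model_readl((base) + (offset))
```

With the byte order handled inside the accessor, wrapping the value in
le32_to_cpu() again would swap it a second time on a big-endian host.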


More in next email.



Re: [patch 3/3] Add tsi108 On Chip Ethernet device driver support

2006-09-20 Thread Jeff Garzik

Zang Roy-r61911 wrote:

+struct tsi108_prv_data {
+   void  __iomem *regs;/* Base of normal regs */
+   void  __iomem *phyregs; /* Base of register bank used for PHY access */
+   
+   int phy;/* Index of PHY for this interface */
+   int irq_num;
+   int id;
+
+   struct timer_list timer;/* Timer that triggers the check phy function */
+   int rxtail; /* Next entry in rxring to read */
+   int rxhead; /* Next entry in rxring to give a new buffer */
+   int rxfree; /* Number of free, allocated RX buffers */
+
+   int rxpending;  /* Non-zero if there are still descriptors
+* to be processed from a previous descriptor
+* interrupt condition that has been cleared */
+
+   int txtail; /* Next TX descriptor to check status on */
+   int txhead; /* Next TX descriptor to use */


most of these should be unsigned, to prevent bugs.
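
A short sketch of why unsigned indices help: with unsigned arithmetic,
wraparound is well defined and an index can never go negative. The ring
helpers below are an illustration written for this note, using free-running
head/tail counters and a power-of-two ring length; they are not the scheme
used by the driver, which keeps explicit free counts.

```c
#include <stdint.h>

#define RING_LEN 256u	/* power of two, like TSI108_TXRING_LEN */

struct ring {
	uint32_t head;	/* total number of entries produced */
	uint32_t tail;	/* total number of entries consumed */
};

/* Entries currently in use; correct even after head wraps past tail. */
static inline uint32_t ring_used(const struct ring *r)
{
	return r->head - r->tail;
}

/* Entries still available to the producer. */
static inline uint32_t ring_free(const struct ring *r)
{
	return RING_LEN - ring_used(r);
}

/* Map a free-running counter onto a slot in the ring array. */
static inline uint32_t ring_slot(uint32_t counter)
{
	return counter & (RING_LEN - 1u);
}
```

With signed ints the same subtraction can overflow, which is undefined
behaviour in C, and a stray negative value would index before the start of the
array.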



+   /* Number of free TX descriptors.  This could be calculated from
+* rxhead and rxtail if one descriptor were left unused to disambiguate
+* full and empty conditions, but it's simpler to just keep track
+* explicitly. */
+
+   int txfree;
+
+   int phy_ok; /* The PHY is currently powered on. */
+
+   /* PHY status (duplex is 1 for half, 2 for full,
+* so that the default 0 indicates that neither has
+* yet been configured). */
+
+   int link_up;
+   int speed;
+   int duplex;
+
+   tx_desc *txring;
+   rx_desc *rxring;
+   struct sk_buff *txskbs[TSI108_TXRING_LEN];
+   struct sk_buff *rxskbs[TSI108_RXRING_LEN];
+
+   dma_addr_t txdma, rxdma;
+
+   /* txlock nests in misclock and phy_lock */
+
+   spinlock_t txlock, misclock;
+
+   /* stats is used to hold the upper bits of each hardware counter,
+* and tmpstats is used to hold the full values for returning
+* to the caller of get_stats().  They must be separate in case
+* an overflow interrupt occurs before the stats are consumed.
+*/
+
+   struct net_device_stats stats;
+   struct net_device_stats tmpstats;
+
+   /* These stats are kept separate in hardware, thus require individual
+* fields for handling carry.  They are combined in get_stats.
+*/
+
+   unsigned long rx_fcs;   /* Add to rx_frame_errors */
+   unsigned long rx_short_fcs; /* Add to rx_frame_errors */
+   unsigned long rx_long_fcs;  /* Add to rx_frame_errors */
+   unsigned long rx_underruns; /* Add to rx_length_errors */
+   unsigned long rx_overruns;  /* Add to rx_length_errors */
+
+   unsigned long tx_coll_abort;    /* Add to tx_aborted_errors/collisions */
+   unsigned long tx_pause_drop;/* Add to tx_aborted_errors */
+
+   unsigned long mc_hash[16];
+};
+
+/* Structure for a device driver */
+
+static struct platform_driver tsi_eth_driver = {
+   .probe = tsi108_init_one,
+   .remove = tsi108_ether_remove,
+   .driver = {
+   .name = "tsi-ethernet",
+   },
+};
+
+static void tsi108_timed_checker(unsigned long dev_ptr);
+
+static void dump_eth_one(struct net_device *dev)
+{
+   struct tsi108_prv_data *data = netdev_priv(dev);
+
+   printk("Dumping %s...\n", dev->name);
+   printk("intstat %x intmask %x phy_ok %d"
+   " link %d speed %d duplex %d\n",
+  TSI108_ETH_READ_REG(TSI108_EC_INTSTAT),
+  TSI108_ETH_READ_REG(TSI108_EC_INTMASK), data->phy_ok,
+  data->link_up, data->speed, data->duplex);
+
+   printk("TX: head %d, tail %d, free %d, stat %x, estat %x, err %x\n",
+  data->txhead, data->txtail, data->txfree,
+  TSI108_ETH_READ_REG(TSI108_EC_TXSTAT),
+  TSI108_ETH_READ_REG(TSI108_EC_TXESTAT),
+  TSI108_ETH_READ_REG(TSI108_EC_TXERR));
+
+   printk("RX: head %d, tail %d, free %d, stat %x,"
+   " estat %x, err %x, pending %d\n\n",
+  data->rxhead, data->rxtail, data->rxfree,
+  TSI108_ETH_READ_REG(TSI108_EC_RXSTAT),
+  TSI108_ETH_READ_REG(TSI108_EC_RXESTAT),
+  TSI108_ETH_READ_REG(TSI108_EC_RXERR), data->rxpending);
+}
+
+/* Synchronization is needed between the thread and up/down events.
+ * Note that the PHY is accessed through the same registers for both
+ * interfaces, so this can't be made interface-specific.
+ */
+
+static DEFINE_SPINLOCK(phy_lock);


you should have a chip structure, that contains two structs (one for 
each interface/port)
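
One way to read this suggestion, as a sketch only: a per-chip structure owns whatever both ports share (here the PHY lock), and embeds one structure per port. Every name below (tsi108_chip, tsi108_port, demo_lock, tsi108_chip_init(), tsi108_port_phy_lock()) is hypothetical, and a dummy lock type stands in for spinlock_t so the example builds outside the kernel.

```c
#include <stddef.h>

#define TSI108_NUM_PORTS 2	/* assumption: two Ethernet ports per chip */

/* Userland placeholder for the kernel's spinlock_t. */
struct demo_lock {
	int locked;
};

struct tsi108_chip;

/* Per-port state (hypothetical; would hold most of tsi108_prv_data). */
struct tsi108_port {
	struct tsi108_chip *chip;	/* back-pointer to the owning chip */
	unsigned int id;		/* port number, 0 or 1 */
};

/* Per-chip state: anything shared by both ports lives here. */
struct tsi108_chip {
	struct demo_lock phy_lock;	/* replaces the file-scope phy_lock */
	struct tsi108_port ports[TSI108_NUM_PORTS];
};

/* Initialize the chip and link each port back to it. */
static void tsi108_chip_init(struct tsi108_chip *chip)
{
	unsigned int i;

	chip->phy_lock.locked = 0;
	for (i = 0; i < TSI108_NUM_PORTS; i++) {
		chip->ports[i].chip = chip;
		chip->ports[i].id = i;
	}
}

/* Ports reach the shared lock through their chip rather than a global. */
static struct demo_lock *tsi108_port_phy_lock(struct tsi108_port *port)
{
	return &port->chip->phy_lock;
}
```

With this layout both ports resolve to the same lock object, and a second chip in the system would get its own lock instead of contending on a single global one.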




+static u16 tsi108_read_mii(struct tsi108_prv_data *data, int reg, int *status)
+{
+   int i;
+   u16 ret;
+
+   TSI108_ETH_WRITE_PHYREG(TSI108_MAC_MII_ADDR,
+   (data->phy << TSI108_MAC_MII_ADDR_PHY) |
+   (reg << TSI108_MAC_MII_ADDR_REG));
+   

Re: [patch 3/3] Add tsi108 On Chip Ethernet device driver support

2006-09-20 Thread Zang Roy-r61911
On Thu, 2006-09-21 at 12:26, Jeff Garzik wrote:
 Zang Roy-r61911 wrote:
  +#define TSI108_ETH_WRITE_REG(offset, val) \
  + writel(le32_to_cpu(val),data->regs + (offset))
  +
  +#define TSI108_ETH_READ_REG(offset) \
  + le32_to_cpu(readl(data->regs + (offset)))
  +
  +#define TSI108_ETH_WRITE_PHYREG(offset, val) \
  + writel(le32_to_cpu(val), data->phyregs + (offset))
  +
  +#define TSI108_ETH_READ_PHYREG(offset) \
  + le32_to_cpu(readl(data->phyregs + (offset)))
 
 
 NAK:
 
 1) writel() and readl() are defined to be little endian.
 
 If your platform is different, then your platform should have its own 
 foobus_writel() and foobus_readl().

The Tsi108 bridge is designed for the PowerPC platform, and originally I
used out_be32() and in_be32(). However, there is no obvious reason why
this bridge could not be used in a little-endian system, perhaps with
some extra hardware logic on the bus interface. The le32_to_cpu() wrapper
is there to account for the endian difference.
Any comments?
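
To make the byte-order question concrete, here is a small userland model; it is a sketch, not kernel code. It takes the review's statement that writel() always produces little-endian bytes as given, and models le32_to_cpu() as a byte swap on a big-endian CPU and a no-op on a little-endian one. The sim_* names and the cpu_is_be flag are hypothetical.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t swab32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Model of le32_to_cpu(): swaps on a big-endian CPU, no-op otherwise. */
static uint32_t sim_le32_to_cpu(int cpu_is_be, uint32_t v)
{
	return cpu_is_be ? swab32(v) : v;
}

/* Model of writel(): the register always receives little-endian bytes. */
static void sim_writel(uint32_t v, uint8_t reg[4])
{
	reg[0] = (uint8_t)(v & 0xffu);
	reg[1] = (uint8_t)((v >> 8) & 0xffu);
	reg[2] = (uint8_t)((v >> 16) & 0xffu);
	reg[3] = (uint8_t)((v >> 24) & 0xffu);
}

/* The pattern from the patch: writel(le32_to_cpu(val), ...). */
static void sim_tsi108_write(int cpu_is_be, uint32_t v, uint8_t reg[4])
{
	sim_writel(sim_le32_to_cpu(cpu_is_be, v), reg);
}
```

In this model, writing 0x11223344 from a big-endian CPU puts 11 22 33 44 on the bus (two swaps, so effectively a big-endian store), while the same call from a little-endian CPU puts 44 33 22 11. The byte order the device sees therefore follows the host CPU rather than the device.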

 
 2) TSI108_ETH_WRITE_REG() is just way too long.  TSI_READ(), 
 TSI_WRITE(), TSI_READ_PHY() and TSI_WRITE_PHY() would be far more
 readable.
 
 More in next email.
 
I will modify the name.
Roy
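
A possible shape for the renamed macros, as a userland sketch under stated assumptions: the short names come from the review, but tsi_demo_data, tsi_bus_readl(), tsi_bus_writel() and tsi_demo_roundtrip() are hypothetical placeholders, and the accessors here are plain memory loads and stores rather than whatever MMIO helper the endianness discussion settles on. Like the originals, the macros expect a local variable named `data`.

```c
#include <stdint.h>

/* Hypothetical stand-in for the driver's private data. */
struct tsi_demo_data {
	volatile uint8_t *regs;		/* MAC register block */
	volatile uint8_t *phyregs;	/* PHY/MII register block */
};

/* Placeholder bus accessors; a real driver would use MMIO helpers. */
static uint32_t tsi_bus_readl(const volatile void *addr)
{
	return *(const volatile uint32_t *)addr;
}

static void tsi_bus_writel(uint32_t val, volatile void *addr)
{
	*(volatile uint32_t *)addr = val;
}

/* Short names suggested in the review; offsets are in bytes. */
#define TSI_READ(offset)		tsi_bus_readl(data->regs + (offset))
#define TSI_WRITE(offset, val)		tsi_bus_writel((val), data->regs + (offset))
#define TSI_READ_PHY(offset)		tsi_bus_readl(data->phyregs + (offset))
#define TSI_WRITE_PHY(offset, val)	tsi_bus_writel((val), data->phyregs + (offset))

/* Example caller: write a MAC register and read it back. */
static uint32_t tsi_demo_roundtrip(struct tsi_demo_data *data,
				   unsigned int offset, uint32_t val)
{
	TSI_WRITE(offset, val);
	return TSI_READ(offset);
}
```

Keeping the byte-order handling inside the two accessors means the macro names can change without touching the endianness decision, and vice versa.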

