Vasantha Kumar Puttappa wrote:
Hi All,
Please somebody guide me here. I desperately need help regarding this
issue. (Please reply to all.)
I am tracking all UDP packets (in particular, SIP-based UDP packets) that
go through iptables using the LOG mechanism.
I use the following command,
John Miller wrote:
Hi Eric,
I CCed netdev since this stuff is about network and not
lkml.
Ok, dropped the CC...
What kind of machine do you have ? SMP or not ?
It's a HP system with two dual core CPUs at 3GHz, the
storage system is connected through QLogic FC-HBA. It should
really be
Eric Dumazet wrote:
John Miller wrote:
Hi Eric,
I CCed netdev since this stuff is about network and not
lkml.
Ok, dropped the CC...
What kind of machine do you have ? SMP or not ?
It's a HP system with two dual core CPUs at 3GHz, the
storage system is connected through QLogic FC
On Tue, 22 May 2007 09:33:29 +0200
Marc Donner [EMAIL PROTECTED] wrote:
Hi,
I have tried to set up quagga with tcp-md5 support from the kernel. All seems ok
with an Intel e100 NIC, but as I tested with an Intel e1000 NIC the TCP
packets have an invalid MD5 digest.
If I run tcpdump on the
Vasily Averin wrote:
sys_setsockopt() does not properly check timeout values for
SO_RCVTIMEO/SO_SNDTIMEO; for example it's possible to set negative timeout
values. POSIX does not define behaviour for setsockopt() in the case of negative
timeouts, but requires that setsockopt() shall fail with -EDOM
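For reference, a minimal userspace sketch (my own illustration, not code from the thread) that pokes at this corner: it hands setsockopt() a negative SO_RCVTIMEO and prints whatever the kernel decides to do.

#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    /* deliberately invalid: a negative receive timeout */
    struct timeval tv = { .tv_sec = -1, .tv_usec = 0 };

    if (fd < 0)
        return 1;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
        printf("setsockopt failed: %s\n", strerror(errno));
    else
        printf("negative timeout was accepted (the behaviour under discussion)\n");
    return 0;
}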
David Miller wrote:
I've had several requests for the capability to change this
timeout, which I think is perfectly reasonable.
So I intend to merge the following upstream unless I hear
some objections :-)
commit 7191f131aff4797f2a906495c7b285d8adf47da2
Author: David S. Miller [EMAIL
Herbert Xu wrote:
Andrew Morton [EMAIL PROTECTED] wrote:
It is possible to introduce UDP packet losses by reading
the proc file entry /proc/net/tcp. The really strange thing is that
the error counters for packet drops are not increased.
Please try this patch and let us know if it helps.
Herbert Xu wrote:
On Fri, May 25, 2007 at 08:50:20AM +0200, Eric Dumazet wrote:
If this patch really helps, this means cond_resched_softirq()
doesn't work at all and should be fixed, or just zapped as it
is seldom used.
cond_resched_softirq lets other threads run if they want to.
It doesn't
LIMIT_NETDEBUG allows the admin to disable some warning messages (echo 0 >
/proc/sys/net/core/warnings).
The "TCP: Treason uncloaked!" message can use this facility.
Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index e613401..e9b151b
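For readers unfamiliar with the macro, a hedged sketch of the kind of change being proposed (illustrative only, not the submitted patch; the function name is hypothetical and the exact macro body varies by kernel version):

#include <linux/kernel.h>
#include <net/sock.h>

/* Illustrative sketch only.  The point of the change: replace a bare
 * printk() with LIMIT_NETDEBUG(), which is ratelimited and, in kernels
 * that expose /proc/sys/net/core/warnings, can be silenced entirely
 * by the administrator. */
static void warn_shrinking_window(void)
{
	/* before the change: an unconditional printk */
	/* printk(KERN_DEBUG "TCP: Treason uncloaked!\n"); */

	/* after the change: honours the warnings sysctl and net_ratelimit() */
	LIMIT_NETDEBUG(KERN_DEBUG "TCP: Treason uncloaked!\n");
}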
David Miller wrote:
From: Eric Dumazet [EMAIL PROTECTED]
Date: Mon, 04 Jun 2007 09:13:40 +0200
LIMIT_NETDEBUG allows the admin to disable some warning messages (echo 0 >
/proc/sys/net/core/warnings).
The "TCP: Treason uncloaked!" message can use this facility.
Signed-off-by: Eric Dumazet
David
I discovered one big problem with UDP binding in 2.6.22-rc4 :
Consider you have eth0 with addr 192.168.0.1
Consider one UDP socket was bound to 192.168.0.1:32769. It will be stored on a
slot != 1
Another UDP socket is created and bound to (0.0.0.0:0)
__udp_lib_get_port() is called
On Thu, 7 Jun 2007 20:40:39 +0900
Tetsuo Handa [EMAIL PROTECTED] wrote:
Hello.
Same local ports are assigned to multiple sockets.
The following program should print different local port numbers.
- Start of program -
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
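The quoted program is cut short by the archive; a minimal sketch of a reproducer of that shape (my own reconstruction, details assumed, not the original code) binds two UDP sockets to port 0 and prints the ports the kernel picked:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Bind a UDP socket to an ephemeral port (port 0) and return the port chosen. */
static int bound_port(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = 0;                        /* let the kernel pick a port */
    bind(fd, (struct sockaddr *)&addr, sizeof(addr));
    getsockname(fd, (struct sockaddr *)&addr, &len);
    return ntohs(addr.sin_port);              /* fd is kept open so the port stays busy */
}

int main(void)
{
    /* With the bug described above, the two printed ports could be identical. */
    printf("first  socket port: %d\n", bound_port());
    printf("second socket port: %d\n", bound_port());
    return 0;
}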
On Mon, 25 Jun 2007 10:28:38 +0530
Varun Chandramohan [EMAIL PROTECTED] wrote:
According to RFC 4292 (IP Forwarding Table MIB) there is a need for an
age entry for all the routes in the routing table. The entry in the RFC is
inetCidrRouteAge and its OID is inetCidrRouteAge.1.10.
Many snmp
On Wed, 4 Jul 2007 11:40:48 +0200
Robert Iakobashvili [EMAIL PROTECTED] wrote:
On 7/4/07, Evgeniy Polyakov [EMAIL PROTECTED] wrote:
On Wed, Jul 04, 2007 at 09:50:31AM +0200, Robert Iakobashvili ([EMAIL
PROTECTED]) wrote:
If I am correct, a TCP server can make up to
64K accepts for a
Original Message
Subject: [RFC] Idea to speedup tcp lookups
Date: Tue, 02 Aug 2005 11:53:12 +0200
From: Eric Dumazet [EMAIL PROTECTED]
To: David S. Miller davem@redhat.com
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
References: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL
Andi Kleen wrote:
David, do you think we could place file->private_data in the same cache
line as file->f_count and file->f_op, so that sockfd_lookup() can access
all the needed information (f_count, f_op, private_data) using one
L1_CACHE_LINE only ?
You mean for 32byte cache lines? Not
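A small standalone illustration of the point (a hypothetical struct, not the kernel's struct file): when the three fields a socket-fd lookup touches sit at the front of the structure, they share one cache line and the fast path costs a single fetch.

#include <stdio.h>
#include <stddef.h>

/* Hypothetical layout: the fields used together by a socket-fd lookup are
 * grouped so they land in the same 32- or 64-byte cache line. */
struct file_like {
    long        f_count;            /* stands in for atomic_t f_count */
    const void *f_op;               /* stands in for file->f_op */
    void       *private_data;       /* stands in for the struct socket pointer */
    char        colder_fields[192]; /* everything the lookup does not touch */
};

int main(void)
{
    printf("f_count offset:      %zu\n", offsetof(struct file_like, f_count));
    printf("f_op offset:         %zu\n", offsetof(struct file_like, f_op));
    printf("private_data offset: %zu\n", offsetof(struct file_like, private_data));
    /* All three offsets fall within the first cache line. */
    return 0;
}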
sockfd_lookups
Thank you
Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
--- linux-2.6.13-rc6/include/linux/fs.h 2005-08-07 20:18:56.0 +0200
+++ linux-2.6.13-rc6-ed/include/linux/fs.h 2005-08-18 01:33:04.0
+0200
@@ -586,20 +586,19 @@
struct dentry *f_dentry
Coywolf Qi Hunt wrote:
On 8/18/05, Eric Dumazet [EMAIL PROTECTED] wrote:
Andi Kleen wrote:
(because of the insane struct file_ra_state f_ra. I wish this structure
were dynamically allocated only for files that really use it)
How about you submit a patch for that instead?
-Andi
David S. Miller wrote:
From: Andi Kleen [EMAIL PROTECTED]
Date: Thu, 18 Aug 2005 03:05:25 +0200
I would just set the ra pointer to a single global structure if the
allocation fails. Then you can avoid all the other checks. It will
slow down things and trash some state, but not fail and
to f_count and f_op fields to speedup
sockfd_lookups
Thank you
Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
--- linux-2.6.13-rc6/include/linux/fs.h 2005-08-07 20:18:56.0 +0200
+++ linux-2.6.13-rc6-ed/include/linux/fs.h 2005-08-18 10:30:35.0
+0200
@@ -586,20 +586,18
Hi all
I have strange numbers on a 4-way SMP Opteron machine, with a single tg3 NIC,
linux-2.6.13-rc6.
I have about 12000 requeues per second.
oprofile data shows high numbers for these related functions :
qdisc_restart() 2.6452 %
dev_queue_xmit() 0.9599 %
pfifo_fast_dequeue() 0.7094 %
David S. Miller wrote:
No, all of your CPUs are racing to get the transmit lock
of the tg3 driver. Whoever wins the race gets to queue
the packet, the others have to back off.
I believe tg3_tx() holds the tx_lock for too long, in the case where 200 or so
skbs are delivered
Maybe adding a
.
Thank you
Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
--- net-2.6.14/net/ipv4/tcp.c 2005-08-26 02:14:00.0 +0200
+++ net-2.6.14-ed/net/ipv4/tcp.c 2005-08-26 02:20:08.0 +0200
@@ -269,7 +269,7 @@
int sysctl_tcp_fin_timeout = TCP_FIN_TIMEOUT;
-DEFINE_SNMP_STAT(struct
David S. Miller wrote:
From: Eric Dumazet [EMAIL PROTECTED]
Date: Fri, 26 Aug 2005 03:07:06 +0200
On one of my production machine, tcp_statistics was sitting in a heavily
modified cache line, so *every* SNMP update had to force a reload.
But I disagree that statistics belong
Benjamin LaHaise wrote:
On Fri, Aug 26, 2005 at 09:11:14AM +0200, Eric Dumazet wrote:
The patch I suggested only changed the root pointer, moving to read_mostly
section because it is really write once at boot, then read only. This is
the same with slab pointers : they are hot objects (read
David S. Miller wrote:
From: Eric Dumazet [EMAIL PROTECTED]
Date: Wed, 24 Aug 2005 01:10:44 +0200
Looking at tg3_tx() more closely, I am not convinced it really needs
to lock tp->tx_lock during the loop. tp->tx_cons (swidx) is changed
in this function only, and could be changed
jamal wrote:
On Sat, 2005-27-08 at 22:38 +0200, Eric Dumazet wrote:
(So about 360 requeues per second, much better than before (12000 / second))
I suspect what you are doing is shoving a lot more packets than the wire
can handle. That's why you are getting the backpressure.
I read back
of Benjamin patch ?
Avoid touching file->f_dentry on sockets, since file->private_data directly
gives us the socket pointer.
Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
--- linux-2.6/net/socket.c 2005-09-06 01:20:25.0 +0200
+++ linux-2.6-ed/net/socket.c 2005-09-06 01:35:02.0
[cpu]);
+ }
+ kfree(info);
It should probably use vfree() like :
+ for_each_cpu(cpu) {
+ if (info->size <= PAGE_SIZE)
+ kfree(info->entries[cpu]);
+ else
+ vfree(info->entries[cpu]);
+ }
See you
Eric Dumazet
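For context, a hedged sketch of the allocation pattern that this free path has to mirror (illustrative function names, not the original netfilter code): buffers up to a page come from kmalloc(), larger ones from vmalloc(), so the free side must make the same size check.

#include <linux/slab.h>
#include <linux/vmalloc.h>

/* Hedged sketch, not the original code.  Small per-CPU counter tables fit
 * in a kmalloc'ed buffer; larger ones need vmalloc(), so kfree()/vfree()
 * must be chosen with the same size test on the free path. */
static void *alloc_counter_entries(size_t size)
{
	if (size <= PAGE_SIZE)
		return kmalloc(size, GFP_KERNEL);
	return vmalloc(size);
}

static void free_counter_entries(void *entries, size_t size)
{
	if (size <= PAGE_SIZE)
		kfree(entries);
	else
		vfree(entries);
}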
On Thursday 05 October 2006 10:57, Evgeniy Polyakov wrote:
Well, it is possible to create a /sys/proc entry for that, and even now
userspace can grow the mapping ring until it is forbidden by the kernel, which
means the limit is reached.
No need for yet another /sys/proc entry.
Right now, I (for example)
On Thursday 05 October 2006 12:55, Evgeniy Polyakov wrote:
On Thu, Oct 05, 2006 at 12:45:03PM +0200, Eric Dumazet ([EMAIL PROTECTED])
What is missing or not obvious is : if events are skipped because of
overflows, what happens ? Are connections stuck forever ? Hope that
everything
Hi David
While browsing net/ipv4/route.c I discovered the compare_keys() function, and a
potential bug in it.
static inline int compare_keys(struct flowi *fl1, struct flowi *fl2)
{
return memcmp(&fl1->nl_u.ip4_u, &fl2->nl_u.ip4_u,
sizeof(fl1->nl_u.ip4_u)) == 0 &&
fl1->oif ==
David Miller wrote:
From: Eric Dumazet [EMAIL PROTECTED]
Date: Wed, 11 Oct 2006 15:11:18 +0200
Using memcmp(ptr1, ptr2, sizeof(SOMEFIELD)) is dangerous because
sizeof(SOMEFIELD) can be larger than the underlying object, because of
alignment constraints.
In this case, sizeof(fl1
David Miller wrote:
From: Eric Dumazet [EMAIL PROTECTED]
Date: Thu, 12 Oct 2006 07:48:20 +0200
Not on my gcc here (gcc version 3.4.4) : it won't zero out the padding bytes
Patrick just proved this too :)
Well, on this machine I have these oprofile numbers :
rt_intern_hash
On Thursday 12 October 2006 10:08, Martin Schiller wrote:
Hi!
I'm searching for a solution to suppress / delay the SYN-ACK packet of a
listening server (application) until it has decided (e.g. analysed the
requesting IP address or checked whether the corresponding other end of a
connection is
On Thursday 12 October 2006 12:13, Martin Schiller wrote:
On Thursday, October 12, 2006 10:38 AM, Eric Dumazet wrote:
Well, it is already possible to delay the 'third packet' of an
outgoing connection with a little hack. But AFAIK not the SYN-ACK of an
incoming connection. It could be cool
On Thursday 12 October 2006 12:31, Evgeniy Polyakov wrote:
On Thu, Oct 12, 2006 at 12:13:26PM +0200, Martin Schiller ([EMAIL PROTECTED])
wrote:
On Thursday, October 12, 2006 10:38 AM, Eric Dumazet wrote:
Well, it is already possible to delay the 'third packet' of an
outgoing connection
to delete entries for at most one timer tick. CPUs
are going faster, hard limits are becoming useless... A similar thing is done in
the net/ipv4/route.c garbage collector.
Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
--- linux-2.6.18/include/net/inetpeer.h Wed Sep 20 05:42:06 2006
+++ linux-2.6.18-ed
David Miller wrote:
From: Eric Dumazet [EMAIL PROTECTED]
Date: Thu, 12 Oct 2006 22:14:12 +0200
1) shrink struct inet_peer on 64-bit platforms.
I noticed sizeof(struct inet_peer) was 64+8 on x86_64
As we don't really need 64-bit timestamps
Rick Jones wrote:
More to the point, on what basis would the application be rejecting a
connection request based solely on the SYN?
True, it isn't like there would suddenly be any call user data as in
XTI/TLI.
DATA payload could be included in the SYN packet. TCP specs allow this AFAIK.
David Miller wrote:
From: Eric Dumazet [EMAIL PROTECTED]
Date: Fri, 13 Oct 2006 05:56:43 +0200
2^31 is 2147483648
That's a *lot* of timer ticks, an inet_peer entry should not stay in
unused_list for more than 10 minutes.
My bad, I thought the time was compared to the creation time
Rick Jones wrote:
Eric Dumazet wrote:
Rick Jones wrote:
More to the point, on what basis would the application be rejecting a
connection request based solely on the SYN?
True, it isn't like there would suddenly be any call user data as in
XTI/TLI.
DATA payload could be included
Hi David
While browsing include/net/request_sock.h I found this suspicious locking
protecting the SYN table hash table. I think this patch is necessary.
Thank you
Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
--- linux-2.6.18/include/net/request_sock.h.orig 2006-10-16
10:53
(Sorry, patch inlined this time)
Hi David
While browsing include/net/request_sock.h I found this suspicious locking
protecting the SYN table hash table. I think this patch is necessary.
Thank you
Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
--- linux-2.6.18/include/net/request_sock.h.orig
On Monday 16 October 2006 18:16, Arnaldo Carvalho de Melo wrote:
On 10/16/06, Eric Dumazet [EMAIL PROTECTED] wrote:
(Sorry, patch inlined this time)
Hi David
While browsing include/net/request_sock.h I found this suspicious locking
protecting the SYN table hash table. I think
On Monday 16 October 2006 18:56, Eric Dumazet wrote:
On Monday 16 October 2006 18:16, Arnaldo Carvalho de Melo wrote:
On 10/16/06, Eric Dumazet [EMAIL PROTECTED] wrote:
(Sorry, patch inlined this time)
Hi David
While browsing include/net/request_sock.h I found this suspicious
On Tuesday 17 October 2006 02:53, Eric Barton wrote:
If so, do you have any ideas about how to do it more economically? It's 2
pointers rather than 1 to avoid forcing an unnecessary packet boundary
between successive zero-copy sends. But I guess that might not be hugely
significant since
On Tuesday 17 October 2006 14:04, Martin Schiller wrote:
On Monday, October 16, 2006 9:02 AM, Lennert Buytenhek wrote:
I wrote something like this a couple of years ago:
http://marc.theaimsgroup.com/?l=linux-netdevm=103666165629419w=2
On Tuesday 17 October 2006 12:39, Evgeniy Polyakov wrote:
I can add such notification, but its existence _is_ the broken design.
After such a condition happens, all new events will disappear (although
they are still accessible through the usual queue) from the mapped buffer.
While writing this I have
On Tuesday 17 October 2006 15:42, Evgeniy Polyakov wrote:
On Tue, Oct 17, 2006 at 03:19:36PM +0200, Eric Dumazet ([EMAIL PROTECTED])
wrote:
On Tuesday 17 October 2006 12:39, Evgeniy Polyakov wrote:
I can add such notification, but its existence _is_ the broken design.
After
On Tuesday 17 October 2006 16:07, Evgeniy Polyakov wrote:
On Tue, Oct 17, 2006 at 03:52:34PM +0200, Eric Dumazet ([EMAIL PROTECTED])
wrote:
What about the case, which I described in other e-mail, when in case of
the full ring buffer, no new events are written there, and when
userspace
On Tuesday 17 October 2006 17:09, Evgeniy Polyakov wrote:
On Tue, Oct 17, 2006 at 04:25:00PM +0200, Eric Dumazet ([EMAIL PROTECTED])
wrote:
On Tuesday 17 October 2006 16:07, Evgeniy Polyakov wrote:
On Tue, Oct 17, 2006 at 03:52:34PM +0200, Eric Dumazet
([EMAIL PROTECTED])
wrote
On Tuesday 17 October 2006 18:01, Evgeniy Polyakov wrote:
Ok, there is one apologist for the mmap buffer implementation, who forced me
to create the first implementation, which was dropped due to absence of
remote mental reading abilities.
Ulrich, does above approach sound good for you?
I actually
On Tuesday 17 October 2006 18:35, Evgeniy Polyakov wrote:
On Tue, Oct 17, 2006 at 06:26:04PM +0200, Eric Dumazet ([EMAIL PROTECTED])
wrote:
On Tuesday 17 October 2006 18:01, Evgeniy Polyakov wrote:
Ok, there is one apologist for mmap buffer implementation, who forced
me to create first
Evgeniy Polyakov wrote:
On Tue, Oct 17, 2006 at 06:45:54PM +0200, Eric Dumazet ([EMAIL PROTECTED])
wrote:
I am not sure I understand what you wrote, English is not our native language.
I think many people gave you feedback. I feel that all feedback on this
mailing list is constructive
)
As many routers are based on PIII (L1_CACHE_SIZE=32), this saves one cache line
per rtable entry.
Thank you
Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
--- linux-2.6.19-rc2/include/net/flow.h 2006-10-18 06:03:08.0 +0200
+++ linux-2.6.19-rc2-ed/include/net/flow.h 2006-10-18 06:56
YOSHIFUJI Hideaki wrote:
In article [EMAIL PROTECTED] (at Wed, 18 Oct 2006 07:08:07 +0200), Eric Dumazet
[EMAIL PROTECTED] says:
+#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
struct {
struct in6_addr daddr
David Miller wrote:
From: Eric Dumazet [EMAIL PROTECTED]
Date: Wed, 18 Oct 2006 07:08:07 +0200
Each route entry includes a 'struct flow'. This structure has a
current size of 80 bytes. This patch makes a size reduction
depending on
CONFIG_IPV6/CONFIG_IPV6_MODULE/CONFIG_DECNET
Hi David
Lots of routers still use CPUs with 32-byte cache lines (Intel PIII).
It makes sense to make sure fields used at lookup time are in the same cache
line, to reduce cache footprint and speed up lookups.
Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
--- linux/include/net/inetpeer.h
On Wednesday 18 October 2006 10:20, Steven Whitehouse wrote:
Hi,
On Tue, Oct 17, 2006 at 11:53:36PM -0700, David Miller wrote:
From: Eric Dumazet [EMAIL PROTECTED]
Date: Wed, 18 Oct 2006 07:42:17 +0200
How many people are using DECNET and want to pay the price of this
20 bytes
On Wednesday 18 October 2006 14:42, Steven Whitehouse wrote:
Hi,
It's not used at the moment[*], but would be required for any kind of
flow tracking. The objnum field could be folded into the objname field,
I guess, on the basis that objnamel == 0 means objname[0] represents the
too in loopback_xmit() not updating 4 fields, but 2.
Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
--- linux/drivers/net/loopback.c 2006-10-18 17:28:20.0 +0200
+++ linux-ed/drivers/net/loopback.c 2006-10-18 18:26:41.0 +0200
@@ -58,7 +58,11 @@
#include <linux/tcp.h>
We don't need a full struct net_device_stats (currently 23 longs : 184 bytes on
x86_64) per possible CPU, but only two counters : bytes and packets.
We also save a few CPU cycles in loopback_xmit() by not updating 4 fields, but 2.
Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
--- linux/drivers/net
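A hedged sketch of the idea (struct and function names are assumptions, not the actual patch): keep just two hot per-CPU counters on the transmit path and fold them into the full net_device_stats only when statistics are read.

#include <linux/netdevice.h>

/* Hedged sketch, not the actual loopback.c patch. */
struct pcpu_lstats {
	unsigned long packets;
	unsigned long bytes;
};

/* Transmit fast path: touch only two per-CPU fields instead of a
 * full per-CPU struct net_device_stats. */
static void count_tx(struct pcpu_lstats *lb_stats, unsigned int len)
{
	lb_stats->packets++;
	lb_stats->bytes += len;
}

/* Readout path: fold the per-CPU counters into the shared stats
 * (shown here with a plain array for simplicity). */
static void fold_stats(struct net_device_stats *stats,
		       const struct pcpu_lstats *per_cpu, int nr_cpus)
{
	int cpu;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		stats->rx_packets += per_cpu[cpu].packets;
		stats->tx_packets += per_cpu[cpu].packets;
		stats->rx_bytes   += per_cpu[cpu].bytes;
		stats->tx_bytes   += per_cpu[cpu].bytes;
	}
}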
Michael Tokarev wrote:
Any idea how to force sending FIN-with-data?
int flag_on = 1;
setsockopt(fd, SOL_TCP, TCP_CORK, &flag_on, sizeof(int));
send(fd, data, datalen, 0);
close(fd);
Eric Dumazet
Michael Tokarev wrote:
Eric Dumazet wrote:
Michael Tokarev wrote:
Any idea how to force sending FIN-with-data?
int flag_on = 1;
setsockopt(fd, SOL_TCP, TCP_CORK, &flag_on, sizeof(int));
send(fd, data, datalen, 0);
close(fd);
That produces two packets - one (or more - depending
Jeff Kirsher wrote:
- This patch is to improve performance by adding prefetch to the ixgb driver
- Add driver comments
Signed-off-by: Jeff Kirsher [EMAIL PROTECTED]
Signed-off-by: Jesse Brandeburg [EMAIL PROTECTED]
Signed-off-by: John Ronciak [EMAIL PROTECTED]
---
On a dual Opteron box, I noticed high oprofile numbers in net/core/dst.c,
in the function dst_destroy(struct dst_entry *dst).
It appears the smp_rmb() done at the beginning of dst_destroy() is the
killer (this is an lfence machine instruction, which apparently does
a *lot* of things... may be IO
I noticed that after a 'ip route flush cache' (manual or timer
triggered) on a busy server, X entries are added to dst_garbage_list.
(X depends on the number of established sockets)
Every 1/10th second (DST_GC_MIN), net/core/dst.c::dst_run_gc() is
fired, and tries to free some entries
raw_smp_processor_id()
were appropriate.
Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
raw_smp_processor_id()
were appropriate.
Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
--- net-2.6.16-orig/net/ipv4/netfilter/ip_tables.c 2005-11-25
10:24:02.0 +0100
+++ net-2.6.16/net/ipv4/netfilter/ip_tables.c 2005-11-25 11:44:40.0
+0100
@@ -988,11 +988,14 @@
{
unsigned
David S. Miller wrote:
This gives further credence to BSD's hostcache which makes it use PMTU
metrics only learned by TCP. I still dislike the reduced granularity
of such a scheme, since as we all know ipsec routes can have wildly
different metrics and can be keyed by things like port
Herbert Xu wrote:
Jayachandran C. [EMAIL PROTECTED] wrote:
diff -ur linux-2.6.15-rc3-git1.clean/net/ipv4/udp.c
linux-2.6.15-rc3-git1/net/ipv4/udp.c
--- linux-2.6.15-rc3-git1.clean/net/ipv4/udp.c Wed Nov 30 21:55:27 2005
+++ linux-2.6.15-rc3-git1/net/ipv4/udp.c Thu Dec 1 05:23:40
!
Acked-by: Eric Dumazet [EMAIL PROTECTED]
Ronciak, John wrote:
In this combination of hardware and in this forwarding test
copybreak is bad but prefetching helps.
e1000 vanilla 1150 kpps
e1000 6.2.15 1084 kpps
e1000 6.2.15 copybreak disabled
David S. Miller wrote:
I agree with the analysis, but I truly hate knobs. Every new
one we add means it's even more true that you need to be a wizard
to get a Linux box performing optimally.
[rant mode]
Well, I suspect this is the reason why various hash tables (IP route cache,
TCP
in cases like small packet routing is being
done.
I am no longer sure that your results on copybreak for host bound
packets can be trusted anymore. All your copybreak was doing was making
the prefetch look good according to my tests.
Eric Dumazet [EMAIL PROTECTED] theorized there may be some value
David S. Miller wrote:
From: jamal [EMAIL PROTECTED]
Date: Wed, 07 Dec 2005 16:37:10 -0500
I think there is value for prefetch - just not the way the current patch
has it. Something less adventurous, as suggested by Robert, would make a
lot more sense.
Looking at the e1000 patch in
John Ronciak wrote:
On 12/7/05, David S. Miller [EMAIL PROTECTED] wrote:
Keyword, this box.
We don't disagree and never have with this. It's why we were asking
you to find us a case where the prefetch shows a detriment to
performance. I think Jesse's data and recommendation of
Robert Olsson wrote:
David S. Miller writes:
For the host bound case, copybreak is always a way due to how
socket buffer accounting works. If you use a 1500 byte SKB for
64 bytes of data, this throws off all of the socket buffer
accounting because you're consuming more of the socket
Jesse Brandeburg wrote:
On 12/7/05, David S. Miller [EMAIL PROTECTED] wrote:
From: Eric Dumazet [EMAIL PROTECTED]
Date: Thu, 08 Dec 2005 04:47:05 +0100
#4 and #5 as proposed in the patch cannot be a win
+ prefetch(next_skb);
+ prefetch(next_skb->data - NET_IP_ALIGN
of false sharing on SMP, and speedup some
socket system calls.
Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
--- linux-2.6.15-rc5/include/net/protocol.h 2005-12-04 06:10:42.0
+0100
+++ linux-2.6.15-rc5-ed/include/net/protocol.h 2005-12-17 11:21:22.0
+0100
@@ -65,7 +65,7
entry of this queue.
I assume that if a CPU queued 10,000 items in its RCU queue, then the oldest
entry cannot still be in use by another CPU. This might sound like a violation
of RCU rules (I'm not an RCU expert) but seems quite reasonable.
Signed-off-by: Eric Dumazet [EMAIL PROTECTED
stress tests, I could not reproduce OOM anymore after applying this patch.
Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
--- linux-2.6.15/kernel/rcupdate.c 2006-01-03 04:21:10.0 +0100
+++ linux-2.6.15-edum/kernel/rcupdate.c 2006-01-06 13:32:02.0 +0100
@@ -71,14 +71,14
Andi Kleen wrote:
On Friday 06 January 2006 11:17, Eric Dumazet wrote:
I assume that if a CPU queued 10.000 items in its RCU queue, then the
oldest entry cannot still be in use by another CPU. This might sounds as a
violation of RCU rules, (I'm not an RCU expert) but seems quite reasonable
Alan Cox wrote:
On Gwe, 2006-01-06 at 11:17 +0100, Eric Dumazet wrote:
I assume that if a CPU queued 10.000 items in its RCU queue, then the oldest
entry cannot still be in use by another CPU. This might sounds as a violation
of RCU rules, (I'm not an RCU expert) but seems quite reasonable
Paul E. McKenney wrote:
On Fri, Jan 06, 2006 at 01:37:12PM +, Alan Cox wrote:
On Gwe, 2006-01-06 at 11:17 +0100, Eric Dumazet wrote:
I assume that if a CPU queued 10.000 items in its RCU queue, then the oldest
entry cannot still be in use by another CPU. This might sounds as a violation
Andi Kleen wrote:
I always disliked the per chain spinlocks even for other hash tables like
TCP/UDP multiplex - it would be much nicer to use a much smaller separately
hashed lock table and save cache. In this case the special case of using
a one entry only lock hash table makes sense.
David S. Miller wrote:
From: Eric Dumazet [EMAIL PROTECTED]
Date: Sat, 07 Jan 2006 08:34:35 +0100
I agree, I do use a hashed spinlock array on my local tree for TCP,
mainly to reduce the hash table size by a factor of 2.
So what do you think about going to a single spinlock for the
routing
David S. Miller wrote:
Eric, how important do you honestly think the per-hashchain spinlocks
are? That's the big barrier to making rt_secret_rebuild() a simple
rehash instead of flushing the whole table as it does now.
No problem for me in going to a single spinlock.
I did the hashed
Rogier Wolff wrote:
On Wed, Jan 11, 2006 at 02:43:49PM +0100, Erik Mouw wrote:
The system only recovers after the Netdev watchdog found out that the
transmit timed out. However, the e1000 register dump starts about 4 to
5 seconds earlier: a possible workaround would be to trigger the
timeout
of a too big
increase of bss (in UP mode) or static per_cpu data for SMP
(PERCPU_ENOUGH_ROOM is currently 32768 bytes)
Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
--- net-2.6/net/ipv4/route.c 2006-01-17 10:51:24.0 +0100
+++ net-2.6-ed/net/ipv4/route.c 2006-01-17 11:25:33.0
Ravikiran G Thirumalai wrote:
Change the atomic_t sockets_allocated member of struct proto to a
per-cpu counter.
Signed-off-by: Pravin B. Shelar [EMAIL PROTECTED]
Signed-off-by: Ravikiran Thirumalai [EMAIL PROTECTED]
Signed-off-by: Shai Fultheim [EMAIL PROTECTED]
Hi Ravikiran
If I
Ravikiran G Thirumalai wrote:
Change struct proto->memory_allocated to a batching per-CPU counter
(percpu_counter) from an atomic_t. A batching counter is better than a
plain per-CPU counter as this field is read often.
Signed-off-by: Pravin B. Shelar [EMAIL PROTECTED]
Signed-off-by:
Ravikiran G Thirumalai wrote:
On Fri, Jan 27, 2006 at 12:16:02PM -0800, Andrew Morton wrote:
Ravikiran G Thirumalai [EMAIL PROTECTED] wrote:
which can be assumed as not frequent.
At sk_stream_mem_schedule(), read_sockets_allocated() is invoked only under
certain conditions, under memory
Ravikiran G Thirumalai wrote:
On Fri, Jan 27, 2006 at 11:30:23PM +0100, Eric Dumazet wrote:
There are several issues here :
alloc_percpu()'s current implementation is a waste of RAM (because it uses
slab allocations that have a minimum size of 32 bytes)
Oh there was a solution
Andrew Morton wrote:
Eric Dumazet [EMAIL PROTECTED] wrote:
Ravikiran G Thirumalai wrote:
On Fri, Jan 27, 2006 at 12:16:02PM -0800, Andrew Morton wrote:
Ravikiran G Thirumalai [EMAIL PROTECTED] wrote:
which can be assumed as not frequent.
At sk_stream_mem_schedule
Eric Dumazet wrote:
Andrew Morton wrote:
Eric Dumazet [EMAIL PROTECTED] wrote:
Ravikiran G Thirumalai wrote:
On Fri, Jan 27, 2006 at 12:16:02PM -0800, Andrew Morton wrote:
Ravikiran G Thirumalai [EMAIL PROTECTED] wrote:
which can be assumed as not frequent
Andrew Morton wrote:
Eric Dumazet [EMAIL PROTECTED] wrote:
An advantage of retaining a spinlock in percpu_counter is that if accuracy
is needed at a low rate (say, /proc reading) we can take the lock and then
go spill each CPU's local count into the main one. It would need to be a
very low
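A self-contained sketch of that scheme (plain C with a pthread mutex standing in for the kernel spinlock; names are illustrative, not the kernel percpu_counter API): updates stay CPU-local until they exceed a batch threshold, and an accurate read takes the lock and folds every CPU's pending delta into the shared count.

#include <pthread.h>

#define NR_CPUS 4
#define BATCH   32

struct pcounter {
    long            count;            /* shared, approximate total */
    long            local[NR_CPUS];   /* per-CPU deltas, cheap to update */
    pthread_mutex_t lock;             /* stands in for the kernel spinlock */
};

static struct pcounter sockets_allocated = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Fast path: touch only the local slot; spill into the shared count
 * once the delta grows past the batch threshold. */
static void pcounter_add(struct pcounter *c, int cpu, long delta)
{
    c->local[cpu] += delta;
    if (c->local[cpu] >= BATCH || c->local[cpu] <= -BATCH) {
        pthread_mutex_lock(&c->lock);
        c->count += c->local[cpu];
        c->local[cpu] = 0;
        pthread_mutex_unlock(&c->lock);
    }
}

/* Slow, accurate read (e.g. for a /proc dump): take the lock and fold
 * every CPU's pending delta into the shared count. */
static long pcounter_read_accurate(struct pcounter *c)
{
    long sum;
    int cpu;

    pthread_mutex_lock(&c->lock);
    for (cpu = 0; cpu < NR_CPUS; cpu++) {
        c->count += c->local[cpu];
        c->local[cpu] = 0;
    }
    sum = c->count;
    pthread_mutex_unlock(&c->lock);
    return sum;
}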
Ravikiran G Thirumalai wrote:
On Sat, Jan 28, 2006 at 01:35:03AM +0100, Eric Dumazet wrote:
Eric Dumazet wrote:
Andrew Morton wrote:
Eric Dumazet [EMAIL PROTECTED] wrote:
long percpu_counter_read_accurate(struct percpu_counter *fbc)
{
long res = 0;
int cpu
Benjamin LaHaise wrote:
On Sat, Jan 28, 2006 at 01:28:20AM +0100, Eric Dumazet wrote:
We might use atomic_long_t only (and no spinlocks)
Something like this ?
Erk, complex and slow... Try using local_t instead, which is substantially
cheaper on the P4 as it doesn't use the lock prefix