,
strerror(errno));
close(skfd);
exit(-1);
}
--
Chris Friesen| MailStop: 043/33/F10
Nortel Networks | work: (613) 765-0557
3500 Carling Avenue | fax: (613) 765-2986
Nepean, ON K2H 8E9 Canada| email: [EMAIL PROTECTED
Richard B. Johnson wrote:
On Fri, 6 Jul 2001, Chris Friesen wrote:
I am using the following snippet of code to find out some information about the
MII PHY interface of my ethernet device (which uses the tulip driver). When I
did some timing measurements with gettimeofday() I found
this is to
wait around doing nothing for a millisecond. Is there some subtle reason why we
would want to wait around for a millisecond before doing anything?
Thanks for your help,
Chris
--
Chris Friesen| MailStop: 043/33/F10
Nortel Networks | work: (613
Richard B. Johnson wrote:
On Fri, 6 Jul 2001, Chris Friesen wrote:
mdelay(1); /* One ms delay... */
...rest of code...
What? What kernel version?
The code here says:
/* Establish sync by sending at least 32 logic ones */
for (i = 32; i >= 0; i
I am trying to get some ideas on what the heck caused a problem with the network
at work, and I was hoping someone might have some ideas.
Yesterday we were having some major network problems, many machines were
completely bogged down. This morning I came in to work to find my linux box
Chris Friesen wrote:
The kicker is that the NIC with the MAC address in question happened to be in my
G4 box running linux (yellowdog, 2.2.17 kernel). It was a D-Link 530TX NIC, if
it matters. The linux box was not configured as a DHCP server or client, and
both interfaces on the box were
William Lee Irwin III wrote:
The sorts of like explicit decisions I'd like to be made for these are:
(1) In a mixture of tasks with varying nice numbers, a given nice number
corresponds to some share of CPU bandwidth. Implementations
should not have the freedom to change this
Peter Williams wrote:
To my mind scheduling
and load balancing are orthogonal and keeping them that way simplifies
things.
Scuse me if I jump in here, but doesn't the load balancer need some way
to figure out a) when to run, and b) which tasks to pull and where to
push them?
I suppose
Peter Williams wrote:
Chris Friesen wrote:
Scuse me if I jump in here, but doesn't the load balancer need some
way to figure out a) when to run, and b) which tasks to pull and where
to push them?
Yes but both of these are independent of the scheduler discipline in force.
It is not clear
Peter Williams wrote:
Chris Friesen wrote:
Suppose I have a really high priority task running. Another very high
priority task wakes up and would normally preempt the first one.
However, there happens to be another cpu available. It seems like it
would be a win if we moved one of those
Mark Glines wrote:
One minor question: is it even possible to be completely fair on SMP?
For instance, if you have a 2-way SMP box running 3 applications, one of
which has 2 threads, will the threaded app have an advantage here? (The
current system seems to try to keep each thread on a
Rogan Dawes wrote:
I guess my point was if we somehow get to an odd number of nanoseconds,
we'd end up with rounding errors. I'm not sure if your algorithm will
ever allow that.
And Ingo's point was that when it takes thousands of nanoseconds for a
single context switch, an error of half a
Rogan Dawes wrote:
My concern was that since Ingo said that this is a closed economy, with
a fixed sum/total, if we lose a nanosecond here and there, eventually
we'll lose them all.
I assume Ingo has set it up so that the system doesn't lose partial
nanoseconds, but rather they'd just be
Con Kolivas wrote:
Indeed we do change timeslice with nice on rt_tasks in mainline at the moment.
Truth is most rt programming couldn't care less about timeslices, but your
point about it deviating from the standard is valid. RSDL does not change
timeslice with nice on SCHED_RR tasks so it's
Lee Revell wrote:
Sounds like Wengophone is broken. It should be using RT threads for
time critical work, as JACK and Ardour2 are doing.
If the app has root privileges to set RT policy, then it could also set
deeply negative nice values as well.
Doesn't really help the regular user with
Con Kolivas wrote:
The practice of renicing kernel threads to negative nice values is of
questionable benefit at best, and at worst leads to larger latencies when
kernel threads are busy on behalf of other tasks.
What about the priority implications of the renicing? It seems a bit
iffy
Randy Dunlap wrote:
allmodconfig on i386:
WARNING: default_idle [arch/i386/kernel/apm.ko] undefined!
WARNING: machine_real_restart [arch/i386/kernel/apm.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2
Please ignore.
I think that this was the result of doing
There has been some discussion on lkml about a function that would
either down a semaphore or else abort if it couldn't get the semaphore
in a certain amount of time. Something along the lines of:
down_timeout(struct semaphore *sem, long timeout);
Does something like this exist? Does
Chris Friesen wrote:
static inline int down_timeout(struct semaphore * sem, unsigned int
timeout)
{
int ret = down_trylock(sem);
if (!ret)
ret = __down_timeout(sem, timeout);
return ret;
}
Sorry, I think that should be:
static inline int down_timeout(struct semaphore
[EMAIL PROTECTED] wrote:
Actually, the *real* reason embedded systems end up using old versions is
much simpler.
They start developing their code on release 2.X.Y, and they keep their code
out-of-tree. Then, when they come up for air, and it's at 2.X.(Y+15), they
discover that we weren't
Towards the end of __oom_kill_task() we see the following comment/code:
/*
* We give our sacrificial lamb high priority and access to
* all the memory it needs. That way it should be able to
* exit() and clear out its resources quickly...
*/
I'm looking for some help decoding an x86 stack trace.
We have a process showing as defunct, but its parent is listed as init.
It shows up in top as using basically 100% cpu, and one cpu out of
four is showing spending all its time in sys.
Dumping all tasks via sysrq gave the following
Just so you know the context, I'm coming at this from the point of view
of an embedded call server designer.
Mark Hahn wrote:
why do you think fairness is good, especially always good?
Fairness is good because it promotes predictability. See the
deterministic section below.
even
I'm trying to test code paths dealing with fragmented memory, so I'd
like to have a simple way to cause fragmented memory in the kernel. Is
there any API in the kernel that would let me allocate two contiguous
pages, then free one of them?
I tried the following, but it triggers an oops in
Evgeniy Polyakov wrote:
I never ever tried to say _everything_ must be driven by events.
IO must be driven, it is a must IMO.
Do you disagree with Linus' post about the difficulty of treating
open(), fstat(), page faults, etc. as events? Or do you not consider
them to be IO?
Chris
Davide Libenzi wrote:
struct async_syscall {
unsigned long nr_sysc;
unsigned long params[8];
long *result;
};
And what would async_wait() return back? Pointers to struct async_syscall
or pointers to result?
Either one has downsides. Pointer to struct async_syscall
Apparently the timeslice of the SCHED_RR process varies with nice level
the same way that it does for SCHED_OTHER. So while niceness doesn't
affect the priority of a SCHED_RR task, it does impact how much cpu it gets.
SUSv3 indicates, Any processes or threads using SCHED_FIFO or SCHED_RR
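One way to look at the timeslice the kernel reports for a task is the POSIX call sched_rr_get_interval(). The following is a minimal sketch, assuming a Linux/glibc userspace; the helper names timespec_to_ns and rr_interval_ns are hypothetical, and this is not the test app from the original post.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: convert a struct timespec to nanoseconds. */
long long timespec_to_ns(const struct timespec *ts)
{
    assert(ts != NULL);
    return (long long)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/*
 * Hypothetical helper: ask the kernel for the round-robin interval of
 * the given pid (0 means the calling process). Returns the interval in
 * nanoseconds, or -1 if the call fails.
 */
long long rr_interval_ns(pid_t pid)
{
    struct timespec ts;

    if (sched_rr_get_interval(pid, &ts) != 0)
        return -1;
    return timespec_to_ns(&ts);
}
```

Calling rr_interval_ns(0) before and after changing the nice level of a SCHED_RR task would show whether the reported interval moves with niceness on a given kernel.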
Stephen Hemminger wrote:
+arp_notify - BOOLEAN
+ Define mode for notification of address and device changes.
+ 0 - (default): do nothing
+ 1 - Generate gratuitous arp replies when device is brought up
+ or hardware address changes.
Did you consider using gratuitous
I still haven't seen any replies, so I'm resending with a few more
people directly in the TO list.
The timeslice of a SCHED_RR process currently varies with nice level the
same way that it does for SCHED_OTHER. I've included a small app below
that demonstrates the issue. So while niceness
In __oom_kill_task(), there is a comment that says,
We give our sacrificial lamb high priority and access to all the memory
it needs. That way it should be able to exit() and clear out its
resources quickly...
However, we don't actually change the priority at that point, we just
give it a
Hi all,
We're seeing the following on startup:
Fusion MPT base driver 3.02.55
Copyright (c) 1999-2005 LSI Logic Corporation
Fusion MPT SAS Host driver 3.02.55
mptbase: Initiating ioc0 bringup
mptbase: ioc0: WARNING - IOC is in FAULT state!!!
FAULT code = 1804h
mptbase: ioc0: ERROR -
Alasdair G Kergon wrote:
On Wed, Jan 09, 2008 at 11:46:03PM +0100, Andi Kleen wrote:
struct inode *inode = file->f_dentry->d_inode;
And oops if that's not defined?
Isn't this basically identical to what was being passed in to .ioctl()?
Chris
--
To unsubscribe from this list: send the line
Hi all,
I've got an issue that's popped up with a deployed system running
2.6.10. I'm looking for some help figuring out why incoming network
packets aren't being processed fast enough.
After a recent userspace app change, we've started seeing packets being
dropped by the ethernet hardware
Kok, Auke wrote:
You're using 2.6.10... you can always replace the e1000 module with the
out-of-tree version from e1000.sf.net, this might help a bit - the version in
the 2.6.10 kernel is very very old.
Do you have any reason to believe this would improve things? It seems
like the problem
James Chapman wrote:
What's changed in your application? Any real-time threads in there?
From the top output below, looks like SigtranServices is consuming all
your CPU...
There are two cpus, and SigtranServices is multithreaded with many
threads. Most of these threads are affined to
David Miller wrote:
You have to be kidding, coming here for help with a nearly
4 year old kernel.
I figured it couldn't hurt to ask...if I can't ask the original authors,
who else is there?
I'd love to work on newer kernels, but we have a commitment to our
customers to support multiple
Ray Lee wrote:
On Jan 10, 2008 9:24 AM, Chris Friesen [EMAIL PROTECTED] wrote:
After a recent userspace app change, we've started seeing packets being
dropped by the ethernet hardware (e1000, NAPI is enabled). The
error/dropped/fifo counts are going up in ethtool:
Can you reproduce
David Miller wrote:
From: Chris Friesen [EMAIL PROTECTED]
Date: Fri, 11 Jan 2008 08:59:26 -0600
I'd love to work on newer kernels, but we have a commitment to our
customers to support multiple releases for a significant amount of time.
And by asking here for people to dig into it for you
Eric Dumazet wrote:
Chris Friesen a écrit :
Based on profiling and instrumentation it seems like the cost of
sctp_endpoint_lookup_assoc() more than triples, which means that the
amount of time that bottom halves are disabled in that function also
triples.
Any idea of the size of sctp hash
Eric Dumazet wrote:
Chris Friesen a écrit :
Based on the profiling information we're spending time in
sctp_endpoint_lookup_assoc() which doesn't actually use hashes, so I
can't see how the hash would be related. I'm pretty new to SCTP
though, so I may be missing something.
Well, it does
Jan Engelhardt wrote:
On Sep 11 2007 21:26, Chris Friesen wrote:
Thunderbird, at least, will automatically inline a single text/plain attachment
when replying. (At least with my current settings, it does.)
No, the thing is: you send it attached with Thunderbird,
and my PINE strips
No responses in a couple days so I'm resending. I've CC'd a few people
who've touched binfmt_elf.c recently.
We've got an unusual elf binary and we seem to be running into a bug in
the elf loader. I'm not an elf expert, so my apologies if I get the
terminology wrong.
The elf spec says
Michal Piotrowski wrote:
On 11/09/2007, Chris Friesen [EMAIL PROTECTED] wrote:
We're running a modified 2.6.10 on a dual-Xeon system.
Eh, this is a pretty ancient kernel.
Yes, it is.
You may want to use one of the long time support kernel 2.6.16.x or 2.6.20.x.
I wish I could
Jeremy Fitzhardinge wrote:
Chris Friesen wrote:
The elf spec says that PT_LOAD segments must be ordered by vaddr. We
want to have a segment at a relatively low fixed vaddr. The exact
address is not important, except that it's lower than the standard elf
headers and so it must be the first
David Schwartz wrote:
Nonsense. The task is always ready-to-run. There is no reason its CPU should
be low. This bug report is based on a misunderstanding of what yielding
means.
The yielding task has given up the cpu. The other task should get to
run for a timeslice (or whatever the
Ingo Molnar wrote:
The correct way to tell the kernel that the task is blocked is to use
futexes for example, or any kernel-based locking or wait object - there
are myriads of APIs for these. (The only well-defined behavior of yield
is for SCHED_FIFO/RR tasks - and that is fully preserved in
David Miller wrote:
When you select VLAN, you by definition are asking for non-VLAN
traffic to be elided. It is like plugging the ethernet cable
into one switch or another.
For max functionality it seems like the raw eth device should show
everything on the wire in promiscuous mode.
If we
Suppose I send down an SG_IO command on a generic scsi device node. As
far as I can tell, the code path looks like this in 2.6.14:
sg_ioctl
sg_new_write
scsi_execute_async (sets up sg_cmd_done as callback)
scsi_do_req
Michael Gerdau wrote:
That having said:
I really do like such obvious (as in: for those knowing the stuff anyway)
comments when looking at code and probably concepts I'm not familiar with.
IMO there is no need to belittle this type of comment. IMO any casual
reader not familiar with the whole
Linus Torvalds wrote:
So the _only_ explanation today for 12GB on a 32-bit machine is
(a) insanity
or
(b) being so lazy as to not bother to upgrade
and in either case, my personal reaction is I'm *not* crazy, and yes, I'm
lazy too, and I can't give a rats *ss about those problems.
How
YOSHIFUJI Hideaki / 吉藤英明 wrote:
In article [EMAIL PROTECTED] (at Wed, 12 Dec 2007
15:57:08 -0600), Chris Friesen [EMAIL PROTECTED] says:
You may try other versions of this command
http://devresources.linux-foundation.org/dev/iproute2/download/
They appear to be numbered by kernel version
Patrick McHardy wrote:
From a kernel perspective there are only complete dumps, the
filtering is done by iproute. So the fact that it shows them
when querying specifically implies there is a bug in the
iproute neighbour filter. Does it work if you omit all
from the ip neigh show command?
Herbert Xu wrote:
Chris Friesen [EMAIL PROTECTED] wrote:
However, if I specifically try to print out one of the missing entries,
it shows up:
[EMAIL PROTECTED]:/root /tmp/ip neigh show 192.168.24.81
192.168.24.81 dev bond2 lladdr 00:01:af:14:e9:8a REACHABLE
What about
ip -4 neigh
David Schwartz wrote:
I've asked versions of this question at least three times and never gotten
anything approaching a straight answer:
1) What is the current default 'sched_yield' behavior?
2) What is the current alternate 'sched_yield' behavior?
I'm pretty sure
David Schwartz wrote:
Chris Friesen wrote:
If CFS really can't support sched_yield's semantics, then it should just
not, and that's that. Return ENOSYS and admit that the behavior sched_yield
is documented to have simply can't be supported by the scheduler.
That's just it though
Over on comp.os.linux.development.system someone asked an interesting
question, and I thought I'd mention it here.
Given a fast low-latency solid state drive, would it ever be beneficial
to simply wait in the kernel for synchronous read/write calls to
complete? The idea is that you could
We have a network with a number of nodes using bonding with arp
monitoring. The arp interval is set to 100ms.
Unfortunately, the bonding code sends the arp packets to the hardware
broadcast address, which means that the number of these arp packets seen
by each node goes up with the number
Jared Hulbert wrote:
Magnetic drives have latencies ~10 milliseconds, current SSD's are an
order of magnitude better (~1 millisecond), new interfaces and
refinements could theoretically get us down one more (~100
microsecond).
They've already done better than that. Here's a solid
Andi Kleen wrote:
This document describes Linux Netlink, which is used in Linux both as
an intra-kernel messaging system as well as between kernel and user
space.
It can be used between user space daemons as well. In fact it is.
e.g. they often listen to each other's messages.
One
Andi Kleen wrote:
Latency was very
important, so we ended up doing essentially a multicast unix socket
rather than taking the extra penalty for UDP multicast.
What extra penalty? Local UDP shouldn't be much more expensive than Unix.
On a 1.4GHz P4 I measured a 44% increase in latency
Andi Kleen wrote:
On a 1.4GHz P4 I measured a 44% increase in latency between a unix
datagram and a UDP datagram.
That's weird.
I just reran on a 3.2GHZ P4 running 2.6.11 (Fedora Core 4). 42% latency
increase.
For stream sockets, unix gives approximately a 62% bandwidth increase
over
Andi Kleen wrote:
On Thu, Dec 06, 2007 at 05:02:40PM -0600, Chris Friesen wrote:
I just reran on a 3.2GHZ P4 running 2.6.11 (Fedora Core 4). 42% latency
increase.
Sounds like something that should be looked into. I know of no
principal reasons for that.
For stream sockets, unix gives
David Miller wrote:
From: Chris Friesen [EMAIL PROTECTED]
Date: Thu, 06 Dec 2007 14:36:54 -0600
One problem we ran into was that there are only 32 multicast groups per
netlink protocol family.
I'm pretty sure we've removed this limitation.
As of 2.6.23 nl_groups is a 32-bit bitmask
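The 32-group limit discussed here comes from the legacy nl_groups field of struct sockaddr_nl, where group N is represented by bit N-1 of a 32-bit mask. A minimal sketch of that mapping follows; the helper names nl_group_fits_legacy_mask and nl_group_to_mask are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if the (1-based) group number can be
 * expressed in the 32-bit legacy mask, 0 otherwise.
 */
int nl_group_fits_legacy_mask(unsigned int group)
{
    return group >= 1 && group <= 32;
}

/*
 * Hypothetical helper: map a 1-based netlink multicast group number to
 * its bit in the legacy 32-bit mask. The caller must pass a group in
 * the range 1..32.
 */
uint32_t nl_group_to_mask(unsigned int group)
{
    assert(nl_group_fits_legacy_mask(group));
    return UINT32_C(1) << (group - 1);
}
```

Groups that do not fit in the mask cannot be requested through nl_groups at bind time; on kernels that support it, they are joined with the NETLINK_ADD_MEMBERSHIP socket option instead.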
David Miller wrote:
The kernel supports much more than 32 groups, see nlk->groups which is
a bitmap which can be sized to arbitrary sizes. nlk->nl_groups is
for backwards compatibility only.
netlink_change_ngroups() does the bitmap resizing when necessary.
Thanks for the explanation. Given
I'm seeing some strange behaviour on a 2.6.14 ppc64 system. If I run
ip neigh show it prints out nothing, but if I run arp then I see the
other nodes on the local network.
[EMAIL PROTECTED]:/root ip neigh show
[EMAIL PROTECTED]:/root arp -n
Address HWtype HWaddress
Chris Friesen wrote:
I'm seeing some strange behaviour on a 2.6.14 ppc64 system. If I run
ip neigh show it prints out nothing, but if I run arp then I see the
other nodes on the local network.
[EMAIL PROTECTED]:/root ip neigh show
[EMAIL PROTECTED]:/root arp -n
Address
I retested it on an x86 machine and am seeing similar problems.
First, arp gives the arp table as expected:
[EMAIL PROTECTED]:/tftpboot/cnp/0-0-5-0/0-0-5-0 arp -n
Address HWtype HWaddress Flags Mask Iface
172.24.0.9 ether 00:03:CC:51:06:5E C
Eric Dumazet wrote:
Chris Friesen a écrit :
Is this expected behaviour?
Probably not... Still a 2.6.14 kernel ?
Yep. Embedded hardware, so I'm unable to test with a more recent kernel.
Could you send the result of :
strace ip neigh show
I've attached two strace runs, one of ip neigh
Eric Dumazet wrote:
And what is the version of ip command you have on this machine ?
ip -V
iproute2-ss051107
You may try other versions of this command
http://devresources.linux-foundation.org/dev/iproute2/download/
They appear to be numbered by kernel version, and the above version is
Andrew Haley wrote:
We're listening, really. It's unacceptable that gcc should break
code.
In that case a conversion of a conditional branch to an unconditional
write to a visible variable is not an acceptable behaviour. Aside from
the kernel issues, it would break any number of threaded
Ulrich Drepper wrote:
I agree. Applications shouldn't be expected to be yet more complicated
and have different levels of low memory handling. You might want to
give a process a second shot at handling SIGDANGER but after that it's
all about preparation for a shutdown.
I disagree. From
Samuel Tardieu wrote:
Pavel == Pavel Machek [EMAIL PROTECTED] writes:
Pavel That works okay on a PC, but try cellphone one day.
Pavel You want management app to close the least used
Pavel application. You do not want _kernel_ to select who to send
Pavel SIGTERM to.
That's why I would prefer
On 10/31/2012 02:14 PM, Oliver Neukum wrote:
On Wednesday 31 October 2012 17:39:19 Alan Cox wrote:
On Wed, 31 Oct 2012 17:17:43 +
Matthew Garrett mj...@srcf.ucam.org wrote:
On Wed, Oct 31, 2012 at 05:21:21PM +, Alan Cox wrote:
On Wed, 31 Oct 2012 17:10:48 +
Matthew
On 11/01/2012 02:27 PM, Pavel Machek wrote:
Could someone write down exact requirements for Linux kernel to be signed by
Microsoft?
Because that's apparently what you want, and I don't think crippling
kexec/suspend is enough.
As I understand it, the kernel won't be signed by Microsoft.
On 11/02/2012 09:48 AM, Vivek Goyal wrote:
On Thu, Nov 01, 2012 at 03:02:25PM -0600, Chris Friesen wrote:
With secure boot enabled, then the kernel should refuse to let an
unsigned kexec load new images, and kexec itself should refuse to
load unsigned images.
Yep, good in theory. Now
On 11/02/2012 04:03 PM, Eric W. Biederman wrote:
Matthew Garrett mj...@srcf.ucam.org writes:
On Fri, Nov 02, 2012 at 01:49:25AM -0700, Eric W. Biederman wrote:
When the goal is to secure Linux I don't see how any of this helps.
Windows 8 compromises are already available so if we turn most
On 11/03/2012 09:40 AM, Michal Zatloukal wrote:
On Sat, Nov 3, 2012 at 12:48 PM, Mike Galbraith efa...@gmx.de wrote:
On Sat, 2012-11-03 at 04:33 -0700, Mike Galbraith wrote:
On Fri, 2012-11-02 at 21:09 +0100, Michal Zatloukal wrote:
Your nice 19 tasks receiving 'too much' CPU when there are
On 11/05/2012 09:31 AM, Jiri Kosina wrote:
I had a naive idea of just putting in-kernel verification of a complete
ELF binary passed to kernel by userspace, and if the signature matches,
jumping to it.
Would work for elf-x86_64 nicely I guess, but we'd lose a lot of other
functionality
On 11/06/2012 01:56 AM, Florian Weimer wrote:
Personally, I think the only way out of this mess is to teach users
how to disable Secure Boot.
If you're going to go that far, why not just get them to install a
RedHat (or SuSE, or Ubuntu, or whoever) key and use that instead?
Secure boot
On 11/07/2012 07:02 PM, Jon Mason wrote:
I'm not a lawyer, nor do I play one on TV, but if
I understand the GPL correctly, RTS only needs to provide the relevant
source to their customers upon request.
Not quite.
Assuming the GPL applies, and that they have modified the code, then
they must
On 10/25/2012 04:49 PM, Wallak wrote:
I've a very annoying behavior with the linux-3.6.x kernels release, and
a monolithic configuration. The USB 2.0 drives are mapped first with
/dev/sda, /dev/sdb... devices, and then the SATA AHCI drives come after.
This is out of order with the BIOS
On 10/26/2012 01:43 PM, Wallak wrote:
Chris Friesen wrote:
On 10/25/2012 04:49 PM, Wallak wrote:
I've a very annoying behavior with the linux-3.6.x kernels release, and
a monolithic configuration. The USB 2.0 drives are mapped first with
/dev/sda, /dev/sdb... devices, and then the SATA AHCI
On 10/18/2012 03:28 PM, Jan Kara wrote:
Yeah, ionice has its limitations. The problem is that all buffered
writes happen just into memory (so completely independently of ionice
settings). Subsequent writing of dirty memory to disk happens using flusher
thread which is a kernel process and it
Hi all,
I just had a quick question on read-write semaphore semantics. Suppose
someone holds a sema for reading, then someone else tries to acquire it
for writing, and blocks. Finally, a third code path tries to acquire it
for reading.
Does this third code path get the sema, or does it
Ingo Molnar wrote:
But, because you assert it that it's risky to criticise sched_yield()
too much, you sure must know at least one real example where it's right
to use it (and cite the line and code where it's used, with
specificity)?
It's fine to criticise sched_yield(). I agree that new
Ingo Molnar wrote:
* Chris Friesen [EMAIL PROTECTED] wrote:
However, there are closed-source and/or frozen-source apps where it's
not practical to rewrite or rebuild the app. Does it make sense to
break the behaviour of all of these?
See the background and answers to that in:
http
Daniel Hazelton wrote:
On Tuesday 04 September 2007 09:27:02 Krzysztof Halasa wrote:
Daniel Hazelton [EMAIL PROTECTED] writes:
US Copyright law. A copyright holder, regardless of what license he/she
may have released the work under, can still revoke the license for a
specific person or group
Hi,
We've got an unusual elf binary and we seem to be running into a bug in
the elf loader. I'm not an elf expert, so my apologies if I get the
terminology wrong.
The elf spec says that PT_LOAD segments must be ordered by vaddr. We
want to have a segment at a relatively low fixed vaddr.
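The ordering rule mentioned above can be checked mechanically on a program header table. The following is a minimal sketch, assuming a 64-bit ELF and the glibc <elf.h> definitions; the function name pt_load_sorted is hypothetical, and this is not the kernel loader's own logic.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if every PT_LOAD entry in the program
 * header table appears in ascending p_vaddr order, 0 otherwise.
 * Entries of other types are skipped.
 */
int pt_load_sorted(const Elf64_Phdr *phdr, size_t count)
{
    int seen_load = 0;
    Elf64_Addr prev_vaddr = 0;

    assert(phdr != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (phdr[i].p_type != PT_LOAD)
            continue;
        if (seen_load && phdr[i].p_vaddr < prev_vaddr)
            return 0;
        prev_vaddr = phdr[i].p_vaddr;
        seen_load = 1;
    }
    return 1;
}
```

A binary with a low fixed-address segment satisfies this check only if that segment's program header is listed before the other PT_LOAD entries.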
Hi all,
We're running a modified 2.6.10 on a dual-Xeon system. We've had a
number of instances where we've seen oopses in the pipe code. I've
included the most recent one below. This bug left us with a hung
process as the pipe code bailed out while pipe_writev() was holding a
sema, and
Randy Dunlap wrote:
+Thunderbird (GUI)
+
+By default, thunderbird likes to mangle text, but there are ways to
+coerce it into being nice.
Can someone describe the problems with just attaching the patch in
Thunderbird? It's what Martin says he does on the linked document...
Chris
Jeff Garzik wrote:
Chris Friesen wrote:
Can someone describe the problems with just attaching the patch in
Thunderbird? It's what Martin says he does on the linked document...
Email clients don't like to quote attachments, even text/plain ones,
which then makes attached patches much more
We've run into an issue (on 2.6.10) where calling lsof triggers lost
packets on our server. Preempt is disabled, and NAPI is enabled.
It appears that for some reason the networking softirq is not being
handled in a timely fashion, which means that the rx ring buffer fills
up and packets
Lee Revell wrote:
On 7/20/07, Chris Friesen [EMAIL PROTECTED] wrote:
We've run into an issue (on 2.6.10) where calling lsof triggers lost
packets on our server. Preempt is disabled, and NAPI is enabled.
Can you reproduce with a recent kernel? Lots of latency issues have
been fixed since
Eric Dumazet wrote:
The problem is in established_get_next() and established_get_first() not
allowing softirq processing, while scanning a possibly huge hash table,
even if few sockets are hashed in.
As cond_resched_softirq() was added in linux-2.6.11, you probably *need*
to check the diffs
Li, Tong N wrote:
On the other hand, if locking does
become a problem for certain systems/workloads, increasing
sysctl_base_round_slice can reduce the locking frequency and alleviate
the problem, at the cost of being relatively less fair across the CPUs.
If locking does become a problem, it
Chris Snook wrote:
Concerns aside, I agree that fairness is important, and I'd really like
to see a test case that demonstrates the problem.
One place that might be useful is the case of fairness between resource
groups, where the load balancer needs to consider each group separately.
Now
Chris Snook wrote:
We have another SMP box that would benefit from group scheduling, but
we can't use it because the load balancer is not nearly good enough.
Which scheduler? Have you tried the CFS group scheduler patches?
CKRM as well.
Haven't tried the CFS group scheduler, as we're
Chris Snook wrote:
I don't think Chris's scenario has much bearing on your patch. What he
wants is to have a task that will always be running, but can't
monopolize either CPU. This is useful for certain realtime workloads,
but as I've said before, realtime requires explicit resource
Chris Snook wrote:
A fraction of *each* CPU, or a fraction of *total* CPU? Per-cpu
granularity doesn't make anything more fair.
Well, our current solution uses per-cpu weights, because our vendor
couldn't get the load balancer working accurately enough. Having
per-cpu weights and cpu