Re: USB extension (repeater) cable
Udo van den Heuvel wrote: Actually, what it looks like is even simpler. The extension cable contains a four-port hub chip (which is the most common commodity chip) and haven't bothered changing the descriptor to tell the computer only one port is actually active. So only one port can be activated, and the others are stubbed out in some evil way. In that case, it should be noisy but harmless. I will do some more testing then. Is there a way to get rid of the messages? No, but you don't have to care about them. -hpa - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Realtime-preemption for 2.6.20-rc5 ?
Hello Sunil and Ingo, Date: 2007-01-20 02:56:40 GMT (20 hours and 26 minutes ago) > 2007-01-20, Sunil Naidu <[EMAIL PROTECTED]> wrote: > I did refer the same. Is it necessary to use only base kernel, say > 2.6.19? Or, can I go ahead with 2.6.19 + 2.6.19.2 patch + 2.6.19-rt > patch? > > If yes, any reason why we need to apply rt patch only to a base kernel? according to my observation 2.6.19-rt15 is based/includes 2.6.19.1 changes. But there has been that nasty clear_page_dirty_for_io() bug causing corruption of ext3. Even that I have tested more 2.6.20-rc + rt, I preffer to stay on "stable" kernel on boxes which I use daily until next stable appears. I have backported clear_page_dirty_for_io() to 2.6.19-rt15 and it worked fine. I have tried to update 2.6.19-rt15 to 2.6.19.2 base. There is result of my attempt Unofficial incremental patch from 2.6.19-rt15 to 2.6.19.2 + rt http://rtime.felk.cvut.cz/repos/ppisa-linux-devel/kernel-patches/current/patch-2.6.19.2-incr.patch Kernel seems to work correctly. I have checked the patch contents and I have not noticed any RT problematic changes in the code according to my dumb knowledge. I would be very happy, if Ingo would be so kind and could confirm my findings, because I am not sure, if final 2.6.20+rt would be ready before we need to prepare setup for our next semester classes at university. Best wishes Pavel Pisa e-mail: [EMAIL PROTECTED] www:http://cmp.felk.cvut.cz/~pisa work: http://www.pikron.com - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: USB extension (repeater) cable
H. Peter Anvin wrote: > Greg KH wrote: >> On Fri, Jan 19, 2007 at 04:40:34PM +0100, Udo van den Heuvel wrote: >>> >>> I just tried my shiny new usb extension cable (repeater): >>> >>> Jan 19 16:01:17 epia kernel: usb 5-1: new high speed USB device using >>> ehci_hcd and address 60 >>> Jan 19 16:01:17 epia kernel: usb 5-1: configuration #1 chosen from 1 >>> choice >>> Jan 19 16:01:17 epia kernel: hub 5-1:1.0: USB hub found >>> Jan 19 16:01:17 epia kernel: hub 5-1:1.0: 4 ports detected >>> Jan 19 16:01:18 epia kernel: hub 5-1:1.0: Cannot enable port 1. Maybe >>> the USB cable is bad? >>> Jan 19 16:01:22 epia last message repeated 3 times >>> Jan 19 16:01:23 epia kernel: hub 5-1:1.0: Cannot enable port 2. Maybe >>> the USB cable is bad? >>> Jan 19 16:01:26 epia last message repeated 3 times >>> Jan 19 16:01:27 epia kernel: hub 5-1:1.0: Cannot enable port 3. Maybe >>> the USB cable is bad? >>> Jan 19 16:01:31 epia last message repeated 3 times [...] > Actually, what it looks like is even simpler. The extension cable > contains a four-port hub chip (which is the most common commodity chip) > and haven't bothered changing the descriptor to tell the computer only > one port is actually active. So only one port can be activated, and the > others are stubbed out in some evil way. In that case, it should be > noisy but harmless. I will do some more testing then. Is there a way to get rid of the messages? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: USB extension (repeater) cable
Greg KH wrote: On Fri, Jan 19, 2007 at 04:40:34PM +0100, Udo van den Heuvel wrote: Hello, I just tried my shiny new usb extension cable (repeater): Jan 19 16:01:17 epia kernel: usb 5-1: new high speed USB device using ehci_hcd and address 60 Jan 19 16:01:17 epia kernel: usb 5-1: configuration #1 chosen from 1 choice Jan 19 16:01:17 epia kernel: hub 5-1:1.0: USB hub found Jan 19 16:01:17 epia kernel: hub 5-1:1.0: 4 ports detected Jan 19 16:01:18 epia kernel: hub 5-1:1.0: Cannot enable port 1. Maybe the USB cable is bad? Jan 19 16:01:22 epia last message repeated 3 times Jan 19 16:01:23 epia kernel: hub 5-1:1.0: Cannot enable port 2. Maybe the USB cable is bad? Jan 19 16:01:26 epia last message repeated 3 times Jan 19 16:01:27 epia kernel: hub 5-1:1.0: Cannot enable port 3. Maybe the USB cable is bad? Jan 19 16:01:31 epia last message repeated 3 times The second cable does the same. Of course we have just one port on this hub... Any ideas? Perhaps the kernel is not lying and this cable really is bad? :) Your hardware can not handle this device, there really is nothing that the kernel can do about this. USB extension cables are horrible things, and usually violate the USB spec and do not always work, as you are finding out. Sorry about that. Actually, what it looks like is even simpler. The extension cable contains a four-port hub chip (which is the most common commodity chip) and haven't bothered changing the descriptor to tell the computer only one port is actually active. So only one port can be activated, and the others are stubbed out in some evil way. In that case, it should be noisy but harmless. -hpa - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: PROBLEM: KB->KiB, MB -> MiB, ... (IEC 60027-2)
David Schwartz wrote: Talk about a cure worse than the disease! So you're saying that 256MB flash cards could be advertised as having 268.4MB? A 512MB RAM stick is mislabelled and could correctly say 536.8MB? That's just plain craziness. Adopting IEC 60027-2 just replaces a set of well-understood problems with all new problems. Except that you're wrong above. Most 512 MB flash cards are less than 512 MiB; most of them are, in fact, around 512 MB! RAM, of course, is consistently 512 MiB. This little tidbit discovered in the process of working on an application which required powers-of-two flash cards, and finding that one does have to use one size larger... -hpa - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Serial port blues
Joe Barr wrote: I'm forwarding this post by the author of a great little program for digital amateur radio on Linux, because I'm curious whether or not the problem he is seeing can be resolved outside the kernel. All comments welcome on/off list. Thanks, Joe Barr K1GPL [...] I've spent the last day staring at the oscilloscope and pins RTS and DTR on the serial output for 4 different computers running 4 different versions of Linux. Also have exhausted the search on the internet for information regarding both the latency and jitter associated with ioctl calls to the serial driver (both ttyS and ttyUSB). I'm sure it is out there somewhere, I just cannot find it. I am now convinced that the current serial port drivers available to us on the Linux platform WILL NOT support CW and/or RTTY that is software generated in a satisfactory manner. To test the latency and jitter of the ioctl calls to set or clear RTS and / or DTR I built a basic square wave generator with microsecond timing precision. The timing could be derived either from the select system call or by controlled i/o to the sound card. Both provide very precise timing of the program loop. Each time through the loop either the RTS/DTR was set or cleared. The timing jitter for each 1/2 cycle was from 0 to +4 msec. This varied between systems as each had different cpu clock rates. The jitter is caused by the asynchronous response of the kernel to the request to control the port. ioctl requests apparantly do not have a very high priority for the kernel. They are probably just serviced by a first-in first-out interrupt service request loop. That type of jitter is tolerable up to about 20 wpm CW. It totally wipes out the ability to generate an FSK signal on the DTR or RTS pin. Okay, here he's using bit-banging of the DTR and RTS pins to generate a fairly high precision output wave. This is not really the Direct access to the serial port(s) is a kernel perogative in Linux. Only kernel level drivers are allowed such port access. So write a kernel driver. It's not like we're locking anybody out. There is certainly enough Amateur Radio/Linux crossover that a kernel enhancement to support Amateur Radio is going to get frowned upon. So ... bottom line is that all of my attempts over the past couple of months to provide CW and / or FSK output signal have been to fraught with pitfalls. The CW seems OK for slow speed keying, but the FSK seems impossible to achieve. The FSK using the UART is also limited by the Linux operating system and the current drivers. That limitation excludes the use of 45 or 56 baud BAUDOT. That is true at the moment (due to unfortunate design choices made early on), but this is already in the process of being changed: http://lkml.org/lkml/2006/10/18/280 Until such time as new information becomes available I am going to comment out all references to CW and / or FSK via RTS/DTR. I also question how useful the FSK on TxD (UART derived) might be to most users since the 45.45 baudrate is not available in the serial port driver. That function will also be commented out. All this should not really come as a surprise since Linux is not a real-time operating system. By the way, I did try the tests with the test program running with nice -20. Not much difference. See again how he should be using real-time priority rather than nice -20. Sorry folks, but we win some and lose some. 73, Dave, W1HKJ -hpa - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Serial port blues
Joe Barr wrote: I'm forwarding this post by the author of a great little program for digital amateur radio on Linux, because I'm curious whether or not the problem he is seeing can be resolved outside the kernel. All comments welcome on/off list. Thanks, Joe Barr K1GPL [...] I've spent the last day staring at the oscilloscope and pins RTS and DTR on the serial output for 4 different computers running 4 different versions of Linux. Also have exhausted the search on the internet for information regarding both the latency and jitter associated with ioctl calls to the serial driver (both ttyS and ttyUSB). I'm sure it is out there somewhere, I just cannot find it. I am now convinced that the current serial port drivers available to us on the Linux platform WILL NOT support CW and/or RTTY that is software generated in a satisfactory manner. To test the latency and jitter of the ioctl calls to set or clear RTS and / or DTR I built a basic square wave generator with microsecond timing precision. The timing could be derived either from the select system call or by controlled i/o to the sound card. Both provide very precise timing of the program loop. Each time through the loop either the RTS/DTR was set or cleared. The timing jitter for each 1/2 cycle was from 0 to +4 msec. This varied between systems as each had different cpu clock rates. The jitter is caused by the asynchronous response of the kernel to the request to control the port. ioctl requests apparantly do not have a very high priority for the kernel. They are probably just serviced by a first-in first-out interrupt service request loop. That type of jitter is tolerable up to about 20 wpm CW. It totally wipes out the ability to generate an FSK signal on the DTR or RTS pin. Okay, here he's using bit-banging of the DTR and RTS pins to generate a fairly high precision output wave. These bits are being used as GPIOs, and would need very precise control. This is much worse for USB serial ports than for ordinary serial ports. Direct access to the serial port(s) is a kernel perogative in Linux. Only kernel level drivers are allowed such port access. So write a kernel driver. It's not like we're locking anybody out. There is certainly enough Amateur Radio/Linux crossover that a kernel enhancement to support Amateur Radio is going to get frowned upon. So ... bottom line is that all of my attempts over the past couple of months to provide CW and / or FSK output signal have been to fraught with pitfalls. The CW seems OK for slow speed keying, but the FSK seems impossible to achieve. The FSK using the UART is also limited by the Linux operating system and the current drivers. That limitation excludes the use of 45 or 56 baud BAUDOT. That is true at the moment (due to unfortunate design choices made early on), but this is already in the process of being changed: http://lkml.org/lkml/2006/10/18/280 Until such time as new information becomes available I am going to comment out all references to CW and / or FSK via RTS/DTR. I also question how useful the FSK on TxD (UART derived) might be to most users since the 45.45 baudrate is not available in the serial port driver. That function will also be commented out. All this should not really come as a surprise since Linux is not a real-time operating system. By the way, I did try the tests with the test program running with nice -20. Not much difference. See again how he should be using real-time priority rather than nice -20. Sorry folks, but we win some and lose some. 73, Dave, W1HKJ -hpa - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Serial port blues
On Sun, Jan 21, 2007 at 12:54:56AM -0500, Theodore Tso wrote: > On Sat, Jan 20, 2007 at 06:36:44PM +0100, Willy Tarreau wrote: > > On Fri, Jan 19, 2007 at 03:37:34PM -0600, Joe Barr wrote: > > > > > > I'm forwarding this post by the author of a great little program for > > > digital amateur radio on Linux, because I'm curious whether or not the > > > problem he is seeing can be resolved outside the kernel. > > > > At least, I see one wrong claim and one unexplored track in his report. > > The wrong claim : the serial port can only be controled by the kernel. > > It is totally wrong for true serial ports. If he does not want to use > > ioctl(), then he can directly program the I/O port. > > There's more wrong with his claim than just that. Another wrong claim > is that it's caused by the Linux kernel not treating ioctl requests > with high priority. Of course that's nonsense. It might be the case > if we were using brain-damaged messaging-passing approach like what > Andrew Tenenbaum is proposing with Minix 3.1, but in Linux, the serial > port DTR/CTS lines are toggled as soon as the userspace executes the > ioctl. Damn you're right. It shocked me too and I know I was missing something when replying but I did not remember what. > The real issue is when does the userspace program get a chance to run. > He's using the select() system call, which only guarantees accuracy up > to the granularity of the system clock. Given that he's reporting a > jitter of between 0 and 4ms, I'm guessing that he's running with a > system clock tick of 250HZ (since 1/250 == 4ms ). Yes, that's what I thought too. In the past, I've been having better resolution with select() and real-time scheduling, but I cannot reliably reproduce it, even on SMP. I remember nothing was running at all on the machine (not even X) and that can make an important difference. But as you say, there will be no guarantee of better accuracy anyway. > So if he wants accuracy greater than that, there are a couple of > things he can do. One is to recompile his kernel with HZ=1000. That > will give him accuracy up to 1ms or so. If he needs better than 1ms > granularity, there are two options. One is use sched_setscheduler() > to enable posix soft-realtime, and then calibrate a busy loop. This > will of course burn power and completely busy out one CPU, so if he > needs to run CW continuously this probably isn't a great solution. On > an SMP system it might work, although it is obviously a huge kludge. Hmmm the busy loop is dirty as hell, even on SMP, but it works ;-) I remember is was possible to reprogram the RTC to interrupt at 8192 Hz. If the task is running with real time prio, it should get this accuracy, or am I mistaken ? > The other choice would be to install Ingo's -rt patches (see > http://rt.wiki.kernel.org for more information), and then use the > Posix high-resolution timer API's (i.e., timer_create, et. al). Make > sure you enable CONFIG_HIGH_RES_TIMERS after you apply the patch. It > would also be a good idea to set a real-time scheduling priority for > the application, to make sure that when the timer goes off, the > process doesn't get preempted by some background cron job. > > > Now he must be careful about avoiding busy loops in the rest of the > > program, or he will have to use the reset button. > > An easy way of dealing with this is to have an sshd running > an alternative port running at a nice high priority (say, prio 95 or > so). That way, if you screw up, you can always login remotely and > kill the offending program. > > There is also a RT Watchdog program which can be found on > rt.wiki.kernel.org which can be used to recover from runaway real-time > processes without needing to hit the reset button. Thanks for those hints, I've been used to play with the reset button, at least it has forced me to double check my code before running it :-) > Finally, please feel free to direct your amateur radio friend to the > [EMAIL PROTECTED] There are plenty of folks there who > would be very happy to help him out. > > 73 de Ted, N1ZSU Regards, Willy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SATA exceptions with 2.6.20-rc5
Björn Steinbrink wrote: On 2007.01.20 22:34:27 -0500, Jeff Garzik wrote: Robert Hancock wrote: change in 2.6.20-rc is either causing or triggering this problem. It would be useful if you could try git bisect between 2.6.19 and 2.6.20-rc5, keeping the latest sata_nv.c each time, and see if that Yes, 'git bisect' would be the next step in figuring out this puzzle. Anybody up for it? I'll go for it, but could I get an explanation how that could lead to a different result than my last bisection? I see the difference of keeping sata_nv.c but my brain can't wrap around it right now (woke up in the middle of the night and still not up to speed...). Whatever the problem is, only seems to show up when ADMA is enabled, and so the patch that added ADMA support shows up as the culprit from your git bisect. However, from what Chr is reporting, 2.6.19 with the ADMA support added in doesn't seem to have the problem, so presumably something else that changed in the 2.6.20-rc series is triggering it. Doing a bisect while keeping the driver code itself the same will hopefully identify what that change is.. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Serial port blues
On Sat, Jan 20, 2007 at 06:36:44PM +0100, Willy Tarreau wrote: > On Fri, Jan 19, 2007 at 03:37:34PM -0600, Joe Barr wrote: > > > > I'm forwarding this post by the author of a great little program for > > digital amateur radio on Linux, because I'm curious whether or not the > > problem he is seeing can be resolved outside the kernel. > > At least, I see one wrong claim and one unexplored track in his report. > The wrong claim : the serial port can only be controled by the kernel. > It is totally wrong for true serial ports. If he does not want to use > ioctl(), then he can directly program the I/O port. There's more wrong with his claim than just that. Another wrong claim is that it's caused by the Linux kernel not treating ioctl requests with high priority. Of course that's nonsense. It might be the case if we were using brain-damaged messaging-passing approach like what Andrew Tenenbaum is proposing with Minix 3.1, but in Linux, the serial port DTR/CTS lines are toggled as soon as the userspace executes the ioctl. The real issue is when does the userspace program get a chance to run. He's using the select() system call, which only guarantees accuracy up to the granularity of the system clock. Given that he's reporting a jitter of between 0 and 4ms, I'm guessing that he's running with a system clock tick of 250HZ (since 1/250 == 4ms ). So if he wants accuracy greater than that, there are a couple of things he can do. One is to recompile his kernel with HZ=1000. That will give him accuracy up to 1ms or so. If he needs better than 1ms granularity, there are two options. One is use sched_setscheduler() to enable posix soft-realtime, and then calibrate a busy loop. This will of course burn power and completely busy out one CPU, so if he needs to run CW continuously this probably isn't a great solution. On an SMP system it might work, although it is obviously a huge kludge. The other choice would be to install Ingo's -rt patches (see http://rt.wiki.kernel.org for more information), and then use the Posix high-resolution timer API's (i.e., timer_create, et. al). Make sure you enable CONFIG_HIGH_RES_TIMERS after you apply the patch. It would also be a good idea to set a real-time scheduling priority for the application, to make sure that when the timer goes off, the process doesn't get preempted by some background cron job. > Now he must be careful about avoiding busy loops in the rest of the > program, or he will have to use the reset button. An easy way of dealing with this is to have an sshd running an alternative port running at a nice high priority (say, prio 95 or so). That way, if you screw up, you can always login remotely and kill the offending program. There is also a RT Watchdog program which can be found on rt.wiki.kernel.org which can be used to recover from runaway real-time processes without needing to hit the reset button. Finally, please feel free to direct your amateur radio friend to the [EMAIL PROTECTED] There are plenty of folks there who would be very happy to help him out. 73 de Ted, N1ZSU - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Serial port blues
From: On Behalf Of Joe Barr > I'm forwarding this post by the author of a great little program for > digital amateur radio on Linux, because I'm curious whether or not the > problem he is seeing can be resolved outside the kernel. From: w1hkj [mailto:[EMAIL PROTECTED] > I am now convinced that the current serial port drivers > available to us > on the Linux platform WILL NOT support CW and/or RTTY that is > software > generated in a satisfactory manner. I don't know what FSK/CW/RTTY/BAUDOT is. > Direct access to the serial port(s) is a kernel perogative in Linux. > Only kernel level drivers are allowed such port access. Not true. > Until such time as new information becomes available I am going to > comment out all references to CW and / or FSK via RTS/DTR. I also > question how useful the FSK on TxD (UART derived) might be to > most users > since the 45.45 baudrate is not available in the serial port driver. > That function will also be commented out. You may be confusing the old-style baud rate lookup table (B9600 et al) with the actual capabilities of the serial port. The lookup table has a limited number of baud rates. You can get more rates than that using a custom divisor. Most motherboard-based serial ports are clocked at 1.8432 Mhz. The UART does 16 samples per bit time, yielding a max baud rate of 115200. 115200 / 25 yields 4608, which is a 1.4% error rate from 4545. This may or may not be acceptable. 115200 / 2535 yields 45.44, which is a 0.01% error rate, which is likely acceptable. > Sorry folks, but we win some and lose some. We make serial port boards with very flexible UARTS. 4545 exactly is achievable. 45.45 too. Linux supported. ..Stu -- We make multiport serial products. http://www.connecttech.com/ (800) 426-8979 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SATA exceptions with 2.6.20-rc5
On 2007.01.20 22:34:27 -0500, Jeff Garzik wrote: > Robert Hancock wrote: > >change in 2.6.20-rc is either causing or triggering this problem. It > >would be useful if you could try git bisect between 2.6.19 and > >2.6.20-rc5, keeping the latest sata_nv.c each time, and see if that > > > Yes, 'git bisect' would be the next step in figuring out this puzzle. > > Anybody up for it? I'll go for it, but could I get an explanation how that could lead to a different result than my last bisection? I see the difference of keeping sata_nv.c but my brain can't wrap around it right now (woke up in the middle of the night and still not up to speed...). Thanks, Björn - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 9/12] repost: cciss: add busy_configuring flag
On Fri, Jan 19 2007, dann frazier wrote: > On Wed, Dec 13, 2006 at 01:52:36PM +0100, Jens Axboe wrote: > > On Tue, Dec 12 2006, Mike Miller (OS Dev) wrote: > > > On Mon, Nov 06, 2006 at 09:32:00PM +0100, Jens Axboe wrote: > > > > On Mon, Nov 06 2006, Mike Miller (OS Dev) wrote: > > > > > PATCH 9 of 12 > > > > > > > > > > This patch adds a check for busy_configuring to prevent starting a > > > > > queue > > > > > on a drive that may be in the midst of updating, configuring, > > > > > deleting, etc. > > > > > > > > > > This had a test for if the queue was stopped or plugged but that > > > > > seemed > > > > > to cause issues. > > > > > Please consider this for inclusion. > > > > > > > > > > Thanks, > > > > > mikem > > > > > > > > > > Signed-off-by: Mike Miller <[EMAIL PROTECTED]> > > > > > > > > > > > > > > > > > > > > --- > > > > > > > > > > drivers/block/cciss.c |5 - > > > > > 1 files changed, 4 insertions(+), 1 deletion(-) > > > > > > > > > > diff -puN drivers/block/cciss.c~cciss_busy_conf_for_lx2619-rc4 > > > > > drivers/block/cciss.c > > > > > --- linux-2.6/drivers/block/cciss.c~cciss_busy_conf_for_lx2619-rc4 > > > > > 2006-11-06 13:27:53.0 -0600 > > > > > +++ linux-2.6-root/drivers/block/cciss.c 2006-11-06 > > > > > 13:27:53.0 -0600 > > > > > @@ -1190,8 +1190,11 @@ static void cciss_check_queues(ctlr_info > > > > > /* make sure the disk has been added and the drive is > > > > > real > > > > >* because this can be called from the middle of > > > > > init_one. > > > > >*/ > > > > > - if (!(h->drv[curr_queue].queue) || > > > > > !(h->drv[curr_queue].heads)) > > > > > + if (!(h->drv[curr_queue].queue) || > > > > > + !(h->drv[curr_queue].heads) || > > > > > + h->drv[curr_queue].busy_configuring) > > > > > continue; > > > > > + > > > > > blk_start_queue(h->gendisk[curr_queue]->queue); > > > > > > > > This is racy, because you don't start the queue when you unset > > > > ->busy_configuring later on. For this to be safe, you need to call > > > > blk_start_queue() when you set ->busy_configuring to 0. > > > > > > Jens, please see Chase's reply to your concerns: > > > > busy_configuring - I do not think this is racy. This > > > > flag is used only when we are removing/deleting a disk. In > > > > this case the queue is cleaned up and the disk is deleted. > > > > If we are doing that then there is no queue to start later. > > > > The check of this flag in the interrupt handler is to prevent > > > > us from trying to start a queue that is in the middle of > > > > being deleted. This flag could be called busy_deleting. > > > > Ok, no worries then if it's simply a going away flag. I wonder if it's > > needed at all, but it certainly doesn't hurt. > > hey Jens, > Just a poke since I haven't seen this change go into your block > tree. Is it still in-plan? I had it stashed for a 2.6.21 merge, it'll go in by then. -- Jens Axboe - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
On Saturday 20 January 2007 22:41, Stephen Clark wrote: >Willy Tarreau wrote: >>On Sat, Jan 20, 2007 at 02:56:20PM -0500, Stephen Clark wrote: >>>Sunil Naidu wrote: On 1/20/07, Willy Tarreau <[EMAIL PROTECTED]> wrote: >It is not expected to increase write performance, but it should help >you do something else during that time, or also give more > responsiveness to Ctrl-C. It is possible that you have fast and > slow RAM, or that your video card uses shared memory which slows > down some parts of memory which are not used anymore with those > parameters. I did test some SATA drives, am getting these value for 2.6.20-rc5:- [EMAIL PROTECTED] ~]$ time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 1024+0 records in 1024+0 records out 1073741824 bytes (1.1 GB) copied, 21.0962 seconds, 50.9 MB/s What can you suggest here w.r.t my RAM & disk? >Willy Thanks, ~Akula2 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ >>> >>>Hi, >>>whitebook vbi s96f core 2 duo t5600 2gb hitachi ATA >>> HTS721060G9AT00 using libata >>>time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 >>>1024+0 records in >>>1024+0 records out >>>1073741824 bytes (1.1 GB) copied, 10.0092 seconds, 107 MB/s >>> >>>real0m10.196s >>>user0m0.004s >>>sys 0m3.440s >> >>You have too much RAM, it's possible that writes did not complete >> before the end of your measurement. Try this instead : >> >>$ time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 | sync >> >>Willy > >Yeah that make a difference: > time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 | sync >1024+0 records in >1024+0 records out >1073741824 bytes (1.1 GB) copied, 8.86719 seconds, 121 MB/s > >real0m43.601s >user0m0.004s >sys 0m3.912s I'd reconsider my new years resolutions for figures like that: #> time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 | sync 1024+0 records in 1024+0 records out 1073741824 bytes (1.1 GB) copied, 24.1455 seconds, 44.5 MB/s real0m25.218s user0m0.009s sys 0m5.763s but then I also have only a gig of ram. So does this look normal? -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Yahoo.com and AOL/TW attorneys please note, additions to the above message by Gene Heskett are: Copyright 2007 by Maurice Eugene Heskett, all rights reserved. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: How to flush the disk write cache from userspace
On Thu, Jan 18 2007, Robert Hancock wrote: > Ricardo Correia wrote: > >On Tuesday 16 January 2007 00:38, you wrote: > >>As always with these things, the devil is in the details. It requires > >>the device to support a ->prepare_flush() queue hook, and not all > >>devices do that. It will work for IDE/SATA/SCSI, though. In some devices > >>you don't want/need to do a real disk flush, it depends on the write > >>cache settings, battery backing, etc. > > > >Is there any chance that someone could implement this (I don't have the > >skills, unfortunately)? Maybe add a new ioctl() to block devices, so that > >it doesn't break any existing code? > > I think we really should have support for doing cache flushes > automatically on fsync, etc. User space code should not have to worry > about this problem, it's pretty silly that for example MySQL has to > advise people to use hdparm -W 0 to disable the write cache on their IDE > drives in order to get proper data integrity guarantees - and disabling > the cache on IDE without command queueing really slaughters the > performance, unnecessarily in this case. Completely agree. If you have barriers enabled in your filesystem, then it should Just Work when you do fsync(). At least that is the case for reiserfs and XFS, I'm not completely sure that ext3 also handles it correctly. For direct block device access, fsync() does need to provide a commit to stable storage as well though. > There may be some cases where the controller provides a battery-backed > cache and thus we don't want to actually force the controller to flush > everything out to the drive on fsync, so we may need to be able to > disable this, but these controllers may ignore flushes anyway. I know > IBM ServeRAID appears to fail requests for write cache info and so the > kernel assumes drive cache: write through and doesn't do any flushes. That would be the preferable approach, just have the hardware that doesn't need a flush ignore the FLUSH_CACHE. That would also need to ignore the FUA bit on writes then. I'm not sure what the spec has to say on this, basically the requirement is just that data is on stable storage (eg survives power failure and so on), then that would be fine. And I would hope it is, it'd be hard to specify anything else. > >I believe it's a very useful (and relatively simple) feature that > >increases data integrity and reliability for applications that need this > >functionality. > > > >I think it must be considered that most people have disk write caches > >enabled and are using IDE, SATA or SCSI disks. > > > >I also think there's no point in disabling disks' write caches, since it > >slows writes and decreases disks' lifetime, and because there's a better > >solution. > > Yes, ideally doing all writes to the drive with write cache enabled and > then flushing them out afterwards would be much more efficient (at least > when no command queueing is involved) since the drive can choose what > order to complete the writes in. That only works if you just care about the stream of writes going to stable storage and don't care about ordering. But the above is essentially how the barriers work on write back cache + non queued devices. When the barrier write is received, we commit the previous writes first with a flush and then write the barrier (followed by another flush, or possibly not if we have FUA). -- Jens Axboe - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
Willy Tarreau wrote: On Sat, Jan 20, 2007 at 02:56:20PM -0500, Stephen Clark wrote: Sunil Naidu wrote: On 1/20/07, Willy Tarreau <[EMAIL PROTECTED]> wrote: It is not expected to increase write performance, but it should help you do something else during that time, or also give more responsiveness to Ctrl-C. It is possible that you have fast and slow RAM, or that your video card uses shared memory which slows down some parts of memory which are not used anymore with those parameters. I did test some SATA drives, am getting these value for 2.6.20-rc5:- [EMAIL PROTECTED] ~]$ time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 1024+0 records in 1024+0 records out 1073741824 bytes (1.1 GB) copied, 21.0962 seconds, 50.9 MB/s What can you suggest here w.r.t my RAM & disk? Willy Thanks, ~Akula2 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ Hi, whitebook vbi s96f core 2 duo t5600 2gb hitachi ATA HTS721060G9AT00 using libata time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 1024+0 records in 1024+0 records out 1073741824 bytes (1.1 GB) copied, 10.0092 seconds, 107 MB/s real0m10.196s user0m0.004s sys 0m3.440s You have too much RAM, it's possible that writes did not complete before the end of your measurement. Try this instead : $ time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 | sync Willy Yeah that make a difference: time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 | sync 1024+0 records in 1024+0 records out 1073741824 bytes (1.1 GB) copied, 8.86719 seconds, 121 MB/s real0m43.601s user0m0.004s sys 0m3.912s -- "They that give up essential liberty to obtain temporary safety, deserve neither liberty nor safety." (Ben Franklin) "The course of history shows that as a government grows, liberty decreases." (Thomas Jefferson) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SATA exceptions with 2.6.20-rc5
Robert Hancock wrote: change in 2.6.20-rc is either causing or triggering this problem. It would be useful if you could try git bisect between 2.6.19 and 2.6.20-rc5, keeping the latest sata_nv.c each time, and see if that Yes, 'git bisect' would be the next step in figuring out this puzzle. Anybody up for it? Jeff - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] Register the bus, vendor and product IDs for dvb-usb remote device
Hi, This patch writes the USB vendor and product IDs into the /sys/class/input/inputX/id/ files, so that udev can find them. A rule like this does the trick for me: KERNEL="event*", SYSFS{../id/vendor}=="2040", SYSFS{../id/product}=="9301", SYMLINK+="input/dvb-remote" --- linux-2.6.18/drivers/media/dvb/dvb-usb/dvb-usb-remote.c.old 2007-01-21 02:43:11.0 + +++ linux-2.6.18/drivers/media/dvb/dvb-usb/dvb-usb-remote.c 2007-01-21 02:39:02.0 + @@ -107,6 +107,9 @@ d->rc_input_dev->keycodemax = KEY_MAX; d->rc_input_dev->name = "IR-receiver inside an USB DVB receiver"; d->rc_input_dev->phys = d->rc_phys; + d->rc_input_dev->id.bustype = BUS_USB; + d->rc_input_dev->id.vendor = d->udev->descriptor.idVendor; + d->rc_input_dev->id.product = d->udev->descriptor.idProduct; /* set the bits for the keys */ deb_rc("key map size: %d\n", d->props.rc_key_map_size); Cheers, Chris ___ What kind of emailer are you? Find out today - get a free analysis of your email personality. Take the quiz at the Yahoo! Mail Championship. http://uk.rd.yahoo.com/evt=44106/*http://mail.yahoo.net/uk - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Undo some of the pseudo-security madness
On Sat, 2007-01-20 at 17:37 +0300, Samium Gromoff wrote: > This patch removes the dropping of ADDR_NO_RANDOMIZE upon execution of setuid > binaries. > > Why? The answer consists of two parts: > > Firstly, there are valid applications which need an unadulterated memory map. > Some of those which do their memory management, like lisp systems (like SBCL). > They try to achieve this by setting ADDR_NO_RANDOMIZE and reexecuting > themselves. this is a ... funny way of achieving this if an application for some reason wants some fixed address for a piece of memory there are other ways to do that (but to some degree all apps that can't take randomization broken; for example a glibc upgrade on a system will also move the address space around by virtue of being bigger or smaller etc etc) > [1]. See the excellent, 'Hackers Hut' by Andries Brouwer, which describes > how AS randomisation can be got around by the means of linux-gate.so.1 got a URL to this? If this is exploiting the fact that the vdso is at a fixed spot... it's no longer the case since quite a while. -- if you want to mail me at work (you don't), use arjan (at) linux.intel.com Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible ways of dealing with OOM conditions.
> On Sat, Jan 20, 2007 at 05:36:03PM -0500, Rik van Riel ([EMAIL PROTECTED]) > wrote: > > Due to the way everything in the kernel works, you cannot > > prevent the memory allocator from allocating everything and > > running out, except maybe by setting aside reserves to deal > > with special subsystems. As a technical side gets described, this is exactly the way I proposed - there is special dedicated pool which does not depend on main system allocator, so if the latter is empty, the former still _can_ work, although it is possible that it will be empty too. Separation. It removes avalanche effect when one problem produces several different. I do not say that some allocator is the best for dealing with such situation, I just pointed that critical pathes were separated in NTA, so they do not depend on each one's failure. Actually that separation was introduced way too long ago with memory pools, this is some kind of continuation, which adds a lot of additional extremely useful features. NTA used for network allocations is that pool, since in real life packets can not be allocated in advance without memory overhead. For simple situations like only ACK generatinos it is possible, which I suggested first, but long-term solution is special allocator. I selected NTA for this task because it has _additional_ features like self-deragmentation, which is very useful part for networking, but if only OOM recovery condition is concerned, then actually any other allocator can be used of course. -- Evgeniy Polyakov - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
OnStream DI30: undescriptive message: CoD != 0 in idescsi_pc_intr
OnStream Di30 (using ide-scsi and osst drivers), when reading or writing I regularly get these kernel messages: <3>ide-scsi: CoD != 0 in idescsi_pc_intr Let's assume flaky hardware; nothing we can hold the kernel to blame for (which is 2.6.19.1) -- it's a good thing it's calling our attention. There's no data corruption, btw. However, said message is quite useless because undescriptive and too terse. bjd - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SATA exceptions with 2.6.20-rc5
Chr wrote: Could you (or anyone else) test what happens if you take the 2.6.20-rc5 version of sata_nv.c and try it on 2.6.19? That would tell us whether it's this change or whether it's something else (i.e. in libata core). Ok, did that! (got a fresh 2.6.19 tar ball, and used 2.6.20-rc5' sata_nv.c with the oneliner in libata_sff.c) And surprise after one hour uptime, there is not even one sata exceptions in dmesg! (I'll report back tomorrow...) That is interesting, indeed.. If that holds up then I assume some other change in 2.6.20-rc is either causing or triggering this problem. It would be useful if you could try git bisect between 2.6.19 and 2.6.20-rc5, keeping the latest sata_nv.c each time, and see if that gives any indication. If not, just trying some of the different 2.6.20-rcX versions may be useful. Before that, though, can you try making this change I suggested below in 2.6.20-rc5 and see if the problem still shows up? Assuming that still doesn't work, can you then try removing these lines from nv_host_intr in 2.6.20-rc5 sata_nv.c and see what that does? /* bail out if not our interrupt */ if (!(irq_stat & NV_INT_DEV)) return 0; as that's the difference I'm most suspicious of causing the problem. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible ways of dealing with OOM conditions.
On Sat, Jan 20, 2007 at 05:36:03PM -0500, Rik van Riel ([EMAIL PROTECTED]) wrote: > Evgeniy Polyakov wrote: > >On Fri, Jan 19, 2007 at 01:53:15PM +0100, Peter Zijlstra > >([EMAIL PROTECTED]) wrote: > > >>>Even further development of such idea is to prevent such OOM condition > >>>at all - by starting swapping early (but wisely) and reduce memory > >>>usage. > >>These just postpone execution but will not avoid it. > > > >No. If system allows to have such a condition, then > >something is broken. It must be prevented, instead of creating special > >hacks to recover from it. > > Evgeniy, you may want to learn something about the VM before > stating that reality should not occur. I.e. I should start believing that OOM can not be prevented, bugs can not be fixed and things can not be changed just because it happens right now? That is why I'm not subscribed to lkml :) > Due to the way everything in the kernel works, you cannot > prevent the memory allocator from allocating everything and > running out, except maybe by setting aside reserves to deal > with special subsystems. > > As for your "swapping early and reduce memory usage", that is > just not possible in a system where a memory writeout may need > one or more memory allocations to succeed and other I/O paths > (eg. file writes) can take memory from the same pools. When system starts swapping only when it can not allocate new page, then it is broken system. I bet you get warm closing way before you hands are frostbitten, and you do not have a liter of alcohol in the packet for such emergency. And to get warm closing you still need to go over cold street into the shop, but you will do it before weather becomes arctic. > With something like iscsi it may be _necessary_ for file writes > and swap to take memory from the same pools, because they can > share the same block device. Of course swapping can require additional allocation, when it happens over network it is quite obvious. The main problem is the fact, that if system was put into the state, when its life depends on the last possible allocation, then it is broken. There is a light connected to car's fuel tank which starts blinking, when amount of fuel is less then predefined level. Car just does not stop suddenly and starts to get fuel from reserve (well eventually it stops, but it says about problem long before it dies). > Please get out of your fantasy world and accept the constraints > the VM has to operate under. Maybe then you and Peter can agree > on something. I can not accept the situation, when problem is not fixed, but instead recovery path is added. There must be both ways of dealing with it - emergency force majeur recovery and preventive steps. What we are talking about (except pointing to obvious things and sending to school-classes), at least how I see this, is ways of dealing with possible OOM condition. If OOM has happend, then there must be recovery path, but OOM must be prevented, and ways to do this were described too. > -- > Politics is the struggle between those who want to make their country > the best in the world, and those who believe it already is. Each group > calls the other unpatriotic. -- Evgeniy Polyakov - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG] Sleeping function called from invalid context (2.6.18.6)
This is for a 2.6.18.6 UP-preempt kernel compiled with gcc-4.1.1, BTW. Cheers, Chris ___ The all-new Yahoo! Mail goes wherever you go - free your email address from your Internet provider. http://uk.docs.yahoo.com/nowyoucan.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Running Linux on FPGA
On Sat, Jan 20, 2007 at 11:42:37PM +, sathesh babu wrote: > Hi, > I am trying to run Linux-2.6.18.2 ( with preemption enable) kernel on FPGA > board which has MIPS24KE processor runs at 12 MHZ. Programmed the timer to > give interrupt at every 10msec. > I am seeing some inconsistence behavior during boot up processor. Some > times it stops after "NET: Registered protocol family 17" and "VFS: Mounted > root (jffs2 filesystem).". > Could some give some pointers why the behavior is random. > Is it OK to program the timer to 10 msec? or should it be more. The overhead of timer interrupts at this low clockrate is significant so I recommend to minimize the timer interrupt rate as far as possible. This is really a tradeoff between latency and overhead and matters much less on hardcores which run at hundreds of MHz. For power sensitive applications lowering the interrupt rate can also help. And that's alredy pretty much what you need to know, that is a 10ms timer is fine. Btw, is this coincidentally on a CoreFPGA 2 or 3 CPU card on a Malta board? Ralf - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: PROBLEM: KB->KiB, MB -> MiB, ... (IEC 60027-2)
> "David" == David Schwartz <[EMAIL PROTECTED]> writes: David> The way RAM and flash are measured is correct. In my experience, RAM and flash *drives* are measured differently. I understand that individual flash chips come in powers of 2, but by the time they're packaged as a "flash drive", some of that has been used up -- yet they're still sold as the full capacity, and the manufacturers use the confusion between MiB and MB to defend the practice. This "16Mb" drive doesn't really have 16 megabytes of capacity - it's really got 15.5. But that's just standard operating procedure for storage manufacturers. Non-volatile storage manufacturers, including hard drive companies, like to define a megabyte as 1,000,000 bytes and a gigabyte as 1,000,000,000 bytes. They're actually two-to-the-power-of-20 and two-to-the-power-of-30 bytes, which is 1,048,576 and 1,073,741,824 bytes respectively. This is the main reason why a "20Gb" hard drive won't actually give you 20Gb of capacity. In flash RAM devices, things can get a bit more complex again, thanks to the small amount of memory which may be used by the device itself for housekeeping. That can vary between device families; a CompactFlash card with a given nominal capacity may actually have a bit less space than a SmartMedia card with the same number on the label. And manufacturers may throw in some more memory to push the real capacity up closer to the stated one, which is what they've done with the USBDrive. It's still about three per cent shy of its claimed capacity, though. -- http://www.dansdata.com/flashcomp.htm (E.g., my "512MB" CF card shows up as "487MB" in the camera -- a difference of exactly 5%, as would be expected by the MB-vs-MiB scam. I'd be happier if the camera said "487MiB", but we're looking at OSes we do have control over, not others.) And this cheat is getting better (for the seller) with every expansion: 1 MiB is 5% bigger than 1 MB 1 GiB is 7% bigger than 1 GB 1 TiB is 10% bigger than 1 TB So when you go out to buy your 1TB drive this year, you're really only buying 0.9TiB or so. Since all the manufacturers do the same thing, it's possible to consider it "fair", at least for comparisons -- but when the customer gets home and formats their drive, I think they'd be happier if the number was the same as on the carton. Just last night I formatted some new "500GB" drives, and they eventually came back with 465GB as the displayed capacity. Wouldn't it make more sense to display that as "465GiB"? David> Talk about a cure worse than the disease! So you're saying that David> 256MB flash cards could be advertised as having 268.4MB? A David> 512MB RAM stick is mislabelled and could correctly say 536.8MB? David> That's just plain craziness. No, it sounds like he wants them advertised as 256MB and 512MiB, respectively -- packaged flash cards tend to use the 1000 multiples, while DRAM uses the 1024. One extra letter doesn't sound all that crazy. How fast is your Ethernet port? 100Mbps or 95.37Mbps? Somewhat archaic now, but how big was your common 3.5" floppy disk (PC "HD" format)? It actually used a basis of 1,024,000: And if two definitions of the megabyte are not enough, a third megabyte of 1 024 000 bytes is the megabyte used to format the familiar 90 mm (3 1/2 inch), "1.44 MB" diskette. -- http://physics.nist.gov/cuu/Units/binary.html What's likely is that the flash and drive manufacturers will either mark their products honestly, or they'll increase the capacity of their product to meet the given label. (Think about the CRT "diagonal" measurements -- it took a few lawsuits, but they eventually switched from measuring bezel-to-bezel, or total tube diagonal, to "viewable". Sure, everyone in technology "knew" that you had to chop off an inch or two from the advertised value to get the viewable; but that's not enough to meet the standard of truth in advertising.) David> Adopting IEC 60027-2 just replaces a set of well-understood David> problems with all new problems. Which are clearly solved in the standards document, and remove any ambiguity. Is one extra character really that painful to you? t. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Weird XFS slowness
Hi; After switching ext3 to xfs, i realize system starts to _really_ unresponsive and extracting tarballs, copying or deleting files or checking out svn repositories are really slow, so i basically try to measure some for both xfs and ext3 with same computer, same kernel (2.6.18.6), same disk, here are the results * between all tests i dropped caches * i already tried to change block device's scheduler to as, noop and cfq, nothing really changes * i already tried 2.6.20-rc5 and 2.6.20-rc5.1.rt8.0085 which Ingo provides but again nothing really changes Kernel Tarball -- a) XFS [EMAIL PROTECTED] ~ $ time tar xvf linux-2.6.19.tar.bz2 ... real2m16.865s user0m21.113s sys 0m2.426s b) EXT3 [EMAIL PROTECTED] ~ $ time tar xvf linux-2.6.19.tar.bz2 ... real0m34.192s user0m20.624s sys 0m1.771s Deletion a) XFS [EMAIL PROTECTED] ~ $ time rm -rf linux-2.6.19/ real0m50.902s user0m0.064s sys 0m1.378s b) EXT3 [EMAIL PROTECTED] ~ $ time rm -rf linux-2.6.19/ real0m1.162s user0m0.031s sys 0m0.411s Copying --- a) XFS [EMAIL PROTECTED] test $ time cp -r ../linux-2.6.19 . ... real1m42.833s user0m0.124s sys 0m2.621s b) EXT3 [EMAIL PROTECTED] test $ time cp -r ../linux-2.6.19 . ... real0m38.456s user0m0.166s sys 0m2.744s I'm not sure these are normal numbers or its a regression (i'm just starting to use XFS) so any hints will be appreciated. Cheers -- S.Çağlar Onur <[EMAIL PROTECTED]> http://cekirdek.pardus.org.tr/~caglar/ Linux is like living in a teepee. No Windows, no Gates and an Apache in house! pgpbyqId4aU9T.pgp Description: PGP signature
pata_sil680 module, udev and changing drive node order
Hi all, I am using kernel 2.6.19 with the new pata and sata drivers. First of all, the drivers work great, no crashes nothing. There is one downside i found by using these drivers, and i am not sure how i can fix this. The drivers load correctly but my drives seem to be in a different order all the time, which is not very convinient when your run md devices. I have a pata_via driver, which is built-in to the kernel since it serves my primary and secundary ATA controller. I have a pata_pdc2027x driver, serving the 3rd and 4th ATA controller on the motherboard. (as module) I have a pata_sil680 driver serving 2 PCI add-in cards (as module) I have a sata_sil driver for the onboard sata controller. (as module) What seems to happen is that either the modules are auto-loaded and that the pata_sil680 driver changes the order of the two PCI cards every reboot or that udev gets different events from different controllers as first after every reboot and therefor creates the device nodes different. So, my question is: how do I force a fixed order for a module handling two PCI cards, or how do I tell udev to always use the same mapping for the device nodes in /dev? Thanks, - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH query] arm: i.MX/MX1 clock event source
Hello Thomas, Sascha and Ingo please can you find some time to review next patch arm: i.MX/MX1 clock event source which has been sent to you and to the ALKML at 2007-01-13. http://thread.gmane.org/gmane.linux.ports.arm.kernel/29510/focus=29533 There seems to be some problems, because this patch has not been accepted to patch-2.6.20-rc5-rt7.patch, but GENERIC_CLOCKEVENTS are set already for i.MX and this results in a problems to run RT kernel on this architecture. config ARCH_IMX bool "IMX" + select GENERIC_TIME + select GENERIC_CLOCKEVENTS Thanks for review and your time Pavel - Subject: arm: i.MX/MX1 clock event source Support clock event source based on i.MX general purpose timer in free running timer mode. Signed-off-by: Pavel Pisa <[EMAIL PROTECTED]> arch/arm/mach-imx/time.c | 92 +++ 1 file changed, 92 insertions(+) Index: linux-2.6.20-rc4/arch/arm/mach-imx/time.c === --- linux-2.6.20-rc4.orig/arch/arm/mach-imx/time.c +++ linux-2.6.20-rc4/arch/arm/mach-imx/time.c @@ -15,6 +15,9 @@ #include #include #include +#ifdef CONFIG_GENERIC_CLOCKEVENTS +#include +#endif #include #include @@ -25,6 +28,11 @@ /* Use timer 1 as system timer */ #define TIMER_BASE IMX_TIM1_BASE +#ifdef CONFIG_GENERIC_CLOCKEVENTS +static struct clock_event_device clockevent_imx; +static enum clock_event_mode clockevent_mode = CLOCK_EVT_PERIODIC; +#endif + static unsigned long evt_diff; /* @@ -42,9 +50,16 @@ imx_timer_interrupt(int irq, void *dev_i if (tstat & TSTAT_COMP) { do { +#ifdef CONFIG_GENERIC_CLOCKEVENTS + if (clockevent_imx.event_handler) + clockevent_imx.event_handler(); + if (likely(clockevent_mode != CLOCK_EVT_PERIODIC)) + break; +#else write_seqlock(_lock); timer_tick(); write_sequnlock(_lock); +#endif IMX_TCMP(TIMER_BASE) += evt_diff; } while (unlikely((int32_t)(IMX_TCMP(TIMER_BASE) @@ -99,11 +114,88 @@ static int __init imx_clocksource_init(v return 0; } +#ifdef CONFIG_GENERIC_CLOCKEVENTS + +static void imx_set_next_event(unsigned long evt, + struct clock_event_device *unused) +{ + evt_diff = evt; + IMX_TCMP(TIMER_BASE) = IMX_TCN(TIMER_BASE) + evt; +} + +static void imx_set_mode(enum clock_event_mode mode, struct clock_event_device *evt) +{ + unsigned long flags; + + /* +* The timer interrupt generation is disabled at least +* for enough time to call imx_set_next_event() +*/ + local_irq_save(flags); + /* Disable interrupt in GPT module */ + IMX_TCTL(TIMER_BASE) &= ~TCTL_IRQEN; + if ((mode != CLOCK_EVT_PERIODIC) || (mode != clockevent_mode)) { + /* Set event time into far-far future */ + IMX_TCMP(TIMER_BASE) = IMX_TCN(TIMER_BASE) - 3; + /* Clear pending interrupt */ + IMX_TSTAT(TIMER_BASE) &= ~TSTAT_COMP; + } + /* Remember timer mode */ + clockevent_mode = mode; + local_irq_restore(flags); + + switch (mode) { + case CLOCK_EVT_PERIODIC: + case CLOCK_EVT_ONESHOT: + /* +* Do not put overhead of interrupt enable/disable into +* imx_set_next_event(), the core has about 4 minutes +* to call imx_set_next_event() or shutdown clock after +* mode switching +*/ + local_irq_save(flags); + IMX_TCTL(TIMER_BASE) |= TCTL_IRQEN; + local_irq_restore(flags); + break; + case CLOCK_EVT_SHUTDOWN: + /* Left event sources disabled, no more interrupts appears */ + break; + } +} + +static struct clock_event_device clockevent_imx = { + .name = "imx_timer1", + .capabilities = CLOCK_CAP_NEXTEVT | CLOCK_CAP_TICK | + CLOCK_CAP_UPDATE | CLOCK_CAP_PROFILE, + .shift = 32, + .set_mode = imx_set_mode, + .set_next_event = imx_set_next_event, +}; + +static int __init imx_clockevent_init(void) +{ + clockevent_imx.mult = div_sc(imx_get_perclk1(), NSEC_PER_SEC, + clockevent_imx.shift); + clockevent_imx.max_delta_ns = + clockevent_delta2ns(0xfffe, _imx); + clockevent_imx.min_delta_ns = + clockevent_delta2ns(0xf, _imx); + register_local_clockevent(_imx); + + return 0; +} +#endif + + static void __init imx_timer_init(void) { imx_timer_hardware_init();
Re: How to use an usb interface than is claimed by HID?
On Sun, 21 Jan 2007, Ivan Ukhov wrote: > No, it won't do. Imagine that I'm not able to modify the kernel with its > drivers. Could I ask you what precisely is the driver you are talking about doing? Why is it not going to be a part of mainline kernel (i.e. being able to be put on blacklist easily). > It should work with usual kernel and HID driver. So I want my driver to > ask the HID driver to free the interfaces or don't claim them at all. Mb > there's an example of such a driver?.. obviously there are a lot of HID > devices and mb a vendor one of them doesn't want to use HID driver for > one of its interfaces to provide some additional features or something, > so he should make the kernel use his driver instead of HID... Sure, there are such in-kernel drivers ... for example Wacom driver. This driver is in-kernel, and it is hooked inside the usb_hid_configure() function to be ignored by the HID layer completely, and all the driver specific handling is handled in drivers/usb/input/wacom*. (When looking at that code, it looks quite ugly by the way. I have no idea why wacom driver is not using HID_QUIRK_IGNORE, but has a hardcoded hook in the usb_hid_configure() instead. I will probably fix this.) Thanks, -- Jiri Kosina - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: How to use an usb interface than is claimed by HID?
Jiri Kosina wrote: On Sat, 20 Jan 2007, Ivan Ukhov wrote: I'm writing a driver for an USB device that has one configuration with several interfacies and one of them is a HID interface. So when I check this interface whether it's claimed (usb_interface_claimed), I find out that it is, and it's claimed by the HID driver. So here is the question: how can I ask the HID driver to unclaim this very interface for me so that I can use it? The HID driver is needed for some other devices, so I can't just rmmod it. Hi Ivan, if I understand correctly what you need, wouldn't setting the HID_QUIRK_IGNORE for a given tuple of idVendor and idProduct be enough? (see hid_blacklist[] in drivers/usb/input/hid-core.c). No, it won't do. Imagine that I'm not able to modify the kernel with its drivers. It should work with usual kernel and HID driver. So I want my driver to ask the HID driver to free the interfaces or don't claim them at all. Mb there's an example of such a driver?.. obviously there are a lot of HID devices and mb a vendor one of them doesn't want to use HID driver for one of its interfaces to provide some additional features or something, so he should make the kernel use his driver instead of HID... Does it make any sense?) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
On 1/21/07, Tim Schmielau <[EMAIL PROTECTED]> wrote: Yes. You have a faster Disk that writes about 45 MB/s. But I am not sure I understand what you want to know? I got these results with a customized 2.6.20-rc5. [EMAIL PROTECTED] kernel]$ uname -a Linux Typhoon 2.6.20-rc5-Topol-M #1 SMP Sun Jan 21 04:35:28 IST 2007 i686 i686 i386 GNU/Linux > [EMAIL PROTECTED] ~]$ time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024; time > sync > 1024+0 records in > 1024+0 records out > 1073741824 bytes (1.1 GB) copied, 19.5007 seconds, 55.1 MB/s > > real0m20.439s > user0m0.004s > sys 0m4.535s > > real0m4.625s > user0m0.000s > sys 0m0.125s [EMAIL PROTECTED] kernel]$ time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024; time 1024+0 records in 1024+0 records out 1073741824 bytes (1.1 GB) copied, 22.7749 seconds, 47.1 MB/s real0m24.541s user0m0.005s sys 0m3.899s real0m0.000s user0m0.000s sys 0m0.000s > > [EMAIL PROTECTED] ~]$ time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 | sync > 1024+0 records in > 1024+0 records out > 1073741824 bytes (1.1 GB) copied, 20.8707 seconds, 51.4 MB/s > > real0m22.449s > user0m0.002s > sys 0m4.922s [EMAIL PROTECTED] kernel]$ time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 | sync 1024+0 records in 1024+0 records out 1073741824 bytes (1.1 GB) copied, 19.8685 seconds, 54.0 MB/s real0m21.373s user0m0.003s sys 0m3.859s > Linux used here is not 2.6.20-rc5, but it's a FC6 2.6.19 binary. Shall > post the results with 2.6.20-rc5. > > BTW, does the results vary with a customized kernel (configured w.r.t > Processor & Hardware) than a generic kernel like FC6? I'd guess the kernel won't make much of a difference as the time is mostly determined by RAM and disk speeds. There is some deviation in the results between these 2 kernels. Is this acceptable? > Are there any other such test cases? Well, what do you want to find out? Anyways, I am in no way expert in the field of benchmarking. I would be trying to benchmark the results on my machines in this fashion (overclocking experiment):- Disk Types Machine RAM SATA 1.5 GBPS - 160 GB P4-HT-3.0 GHz 2x1GB Corsair SATA 3.0 GBPS - 320 GB P4-HT-3.0 GHz 2x1GB Corsair SATA 1.5 GBPS - 160 GB P4-HT-3.0 GHz 2x1GB OCZ SATA 3.0 GBPS - 320 GB P4-HT-3.0 GHz 2x1GB OCZ SATA 1.5 GBPS - 160 GB P4-HT-3.0 GHz 2x1GB Supertalent SATA 3.0 GBPS - 320 GB P4-HT-3.0 GHz 2x1GB Supertalent SATA 1.5 GBPS - 160 GB P4-HT-3.0 GHz 2x1GB Hynix SATA 3.0 GBPS - 320 GBP4-HT-3.0 GHz 2x1GB Hynix Boards here would be used are Intel based 915, 965, and 975. Would be happy to know more test cases - for RAM/Disk/Processor Frequency And, I don't work for any magazine, writing a review ;-) Tim Thanks, ~Akula2 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: How to use an usb interface than is claimed by HID?
On Sat, 20 Jan 2007, Ivan Ukhov wrote: > I'm writing a driver for an USB device that has one configuration with > several interfacies and one of them is a HID interface. So when I check > this interface whether it's claimed (usb_interface_claimed), I find out > that it is, and it's claimed by the HID driver. So here is the question: > how can I ask the HID driver to unclaim this very interface for me so > that I can use it? The HID driver is needed for some other devices, so I > can't just rmmod it. Hi Ivan, if I understand correctly what you need, wouldn't setting the HID_QUIRK_IGNORE for a given tuple of idVendor and idProduct be enough? (see hid_blacklist[] in drivers/usb/input/hid-core.c). -- Jiri Kosina - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Undo some of the pseudo-security madness
Samium Gromoff wrote: >This patch removes the dropping of ADDR_NO_RANDOMIZE upon execution of setuid >binaries. > >Why? The answer consists of two parts: > >Firstly, there are valid applications which need an unadulterated memory map. >Some of those which do their memory management, like lisp systems (like SBCL). >They try to achieve this by setting ADDR_NO_RANDOMIZE and reexecuting >themselves. > >Secondly, there also are valid reasons to want those applications to be setuid >root. Like poking hardware. This has the unfortunate side-effect of making it easier for local attackers to mount privilege escalation attacks against setuid binaries -- even those setuid binaries that don't need unadulterated memory maps. There's a cleaner solution to the problem case you mentioned. Rather than re-exec()ing itself, the application could be split into two executables: the first is a tiny setuid-root wrapper which sets ADDR_NO_RANDOMIZE and then executes the second program; the second is not setuid-anything and does all the real work. Such a decomposition is often better for security for other reasons, too (such as the fact that the wrapper can drop all unneeded privileges before exec()ing the second executable). Why would you need an entire lisp system to be setuid root? That sounds like a really bad idea. I fail to see why that is a relevant example. Perhaps the fact that such a lisp system breaks if you have security features enabled should tell you something. It may be possible to defeat address space randomization in some cases, but that doesn't mean address space randomization is worthless. It sounds like there is a tradeoff between security and backwards compatibility. I don't claim to know how to choose between those tradeoffs, but I think one ought to at least be aware of the pros and cons on both sides. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: O_DIRECT question
On Saturday 20 January 2007 21:55, Michael Tokarev wrote: > Denis Vlasenko wrote: > > On Thursday 11 January 2007 18:13, Michael Tokarev wrote: > >> example, which isn't quite possible now from userspace. But as long as > >> O_DIRECT actually writes data before returning from write() call (as it > >> seems to be the case at least with a normal filesystem on a real block > >> device - I don't touch corner cases like nfs here), it's pretty much > >> THE ideal solution, at least from the application (developer) standpoint. > > > > Why do you want to wait while 100 megs of data are being written? > > You _have to_ have threaded db code in order to not waste > > gobs of CPU time on UP + even with that you eat context switch > > penalty anyway. > > Usually it's done using aio ;) > > It's not that simple really. > > For reads, you have to wait for the data anyway before doing something > with it. Omiting reads for now. Really? All 100 megs _at once_? Linus described fairly simple (conceptually) idea here: http://lkml.org/lkml/2002/5/11/58 In short, page-aligned read buffer can be just unmapped, with page fault handler catching accesses to yet-unread data. As data comes from disk, it gets mapped back in process' address space. This way read() returns almost immediately and CPU is free to do something useful. > For writes, it's not that problematic - even 10-15 threads is nothing > compared with the I/O (O in this case) itself -- that context switch > penalty. Well, if you have some CPU intensive thing to do (e.g. sort), why not benefit from lack of extra context switch? Assume that we have "clever writes" like Linus described. /* something like "caching i/o over this fd is mostly useless" */ /* (looks like this API is easier to transition to * than fadvise etc. - it's "looks like" O_DIRECT) */ fd = open(..., flags|O_STREAM); ... /* Starts writeout immediately due to O_STREAM, * marks buf100meg's pages R/O to catch modifications, * but doesn't block! */ write(fd, buf100meg, 100*1024*1024); /* We are free to do something useful in parallel */ sort(); > > I hope you agree that threaded code is not ideal performance-wise > > - async IO is better. O_DIRECT is strictly sync IO. > > Hmm.. Now I'm confused. > > For example, oracle uses aio + O_DIRECT. It seems to be working... ;) > As an alternative, there are multiple single-threaded db_writer processes. > Why do you say O_DIRECT is strictly sync? I mean that O_DIRECT write() blocks until I/O really is done. Normal write can block for much less, or not at all. > In either case - I provided some real numbers in this thread before. > Yes, O_DIRECT has its problems, even security problems. But the thing > is - it is working, and working WAY better - from the performance point > of view - than "indirect" I/O, and currently there's no alternative that > works as good as O_DIRECT. Why we bothered to write Linux at all? There were other Unixes which worked ok. -- vda - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[BUG] Sleeping function called from invalid context (2.6.18.6)
Hi, I have been testing my wireless zd1211rw driver with kismet, but have noticed my logs filling up with these messages instead: BUG: sleeping function called from invalid context at kernel/mutex.c:86 in_atomic():0, irqs_disabled():1 [] mutex_lock+0x12/0x1a [] netdev_run_todo+0x10/0x1f1 [] dev_ioctl+0x465/0x497 [] d_rehash+0x47/0x78 [] sock_attach_fd+0x6c/0xcb [] sock_ioctl+0x0/0x1b3 [] do_ioctl+0x1c/0x5d [] vfs_ioctl+0x241/0x254 [] sys_ioctl+0x2c/0x43 [] syscall_call+0x7/0xb According to the comment in linux/net/core/dev.c * 2) Since we run with the RTNL semaphore not held, we can sleep *safely in order to wait for the netdev refcnt to drop to zero. Well, apparently we can't sleep safely after all because IRQs have been disabled. Cheers, Chris ___ New Yahoo! Mail is the ultimate force in competitive emailing. Find out more at the Yahoo! Mail Championships. Plus: play games and win prizes. http://uk.rd.yahoo.com/evt=44106/*http://mail.yahoo.net/uk - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible ways of dealing with OOM conditions.
Evgeniy Polyakov wrote: On Fri, Jan 19, 2007 at 01:53:15PM +0100, Peter Zijlstra ([EMAIL PROTECTED]) wrote: Even further development of such idea is to prevent such OOM condition at all - by starting swapping early (but wisely) and reduce memory usage. These just postpone execution but will not avoid it. No. If system allows to have such a condition, then something is broken. It must be prevented, instead of creating special hacks to recover from it. Evgeniy, you may want to learn something about the VM before stating that reality should not occur. Due to the way everything in the kernel works, you cannot prevent the memory allocator from allocating everything and running out, except maybe by setting aside reserves to deal with special subsystems. As for your "swapping early and reduce memory usage", that is just not possible in a system where a memory writeout may need one or more memory allocations to succeed and other I/O paths (eg. file writes) can take memory from the same pools. With something like iscsi it may be _necessary_ for file writes and swap to take memory from the same pools, because they can share the same block device. Please get out of your fantasy world and accept the constraints the VM has to operate under. Maybe then you and Peter can agree on something. -- Politics is the struggle between those who want to make their country the best in the world, and those who believe it already is. Each group calls the other unpatriotic. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: PROBLEM: KB->KiB, MB -> MiB, ... (IEC 60027-2)
> Nice observation, however, it still leaves quite an amount of internal > inconsistencies in the kernel output. I agree with the majority view that using the term 'MB' or 'GB' to mean a million or a billion bytes is inaccurate. The way RAM and flash are measured is correct. The way disk manufacturers advertise disk capacity is simply *wrong*. There is no word for a million bytes. There is no word for a billion bytes. > One way of getting rid of those inconsistencies would be to follow IEC > 60027-2 for those cases where SI is inappropriate. Talk about a cure worse than the disease! So you're saying that 256MB flash cards could be advertised as having 268.4MB? A 512MB RAM stick is mislabelled and could correctly say 536.8MB? That's just plain craziness. Adopting IEC 60027-2 just replaces a set of well-understood problems with all new problems. DS - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[-mm patch] oops in drivers/net/shaper.c
Hi, The following code: [...] s = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL)); socket_address.sll_family = PF_PACKET; socket_address.sll_protocol = htons(ETH_P_IP); /* * this happens to be shaper0 on my system */ => socket_address.sll_ifindex = 2; socket_address.sll_hatype = ARPHRD_ETHER; socket_address.sll_pkttype = PACKET_OTHERHOST; socket_address.sll_halen = ETH_ALEN; socket_address.sll_addr[0] = 0x00; socket_address.sll_addr[1] = 0x04; socket_address.sll_addr[2] = 0x75; socket_address.sll_addr[3] = 0xC8; socket_address.sll_addr[4] = 0x28; socket_address.sll_addr[5] = 0xE5; socket_address.sll_addr[6] = 0x00; socket_address.sll_addr[7] = 0x00; memcpy((void *) buffer, (void *) dest_mac, ETH_ALEN); memcpy((void *) (buffer + ETH_ALEN), (void *) src_mac, ETH_ALEN); eh->h_proto = 0x00; for (j = 0; j < 1500; j++) { data[j] = (unsigned char) ((int) (255.0 * rand() / (RAND_MAX + 1.0))); } /* * Oopses here */ => send_result = sendto(s, buffer, 1499, 0, (struct sockaddr *) _address, sizeof(socket_address)); [...] Causes the following oops: [ 66.355049] BUG: unable to handle kernel NULL pointer dereference at virtual address [ 66.355053] printing eip: [ 66.355055] [ 66.355056] *pde = [ 66.355059] Oops: [#1] [ 66.355061] PREEMPT SMP DEBUG_PAGEALLOC [ 66.355065] last sysfs file: /devices/pci:00/:00:1e.2/modalias [ 66.355069] Modules linked in: snd_pcm_oss snd_mixer_oss snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device af_packet ohci_hcd fuse cpufreq_stats cpufreq_powersave cpufreq_ondemand cpufreq_conservative speedstep_centrino freq_table processor ac battery i915 drm usb_storage parport_pc parport sr_mod serio_raw yenta_socket rsrc_nonstatic pcmcia_core ipw2200 tg3 snd_intel8x0 snd_ac97_codec pcspkr ac97_bus snd_pcm ehci_hcd snd_timer snd soundcore snd_page_alloc uhci_hcd usbcore shpchp pci_hotplug joydev evdev tsdev [ 66.355115] CPU:0 [ 66.355116] EIP:0060:[<>]Not tainted VLI [ 66.355117] EFLAGS: 00210282 (2.6.20-rc4-mm1-def01 #2) [ 66.355122] EIP is at 0x0 [ 66.355124] eax: f6a1f480 ebx: f705a500 ecx: 0800 edx: [ 66.355128] esi: f6a1f480 edi: 0800 ebp: f6261d90 esp: f6261d70 [ 66.355131] ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 [ 66.355134] Process aze (pid: 11005, ti=f626 task=f62d08b0 task.ti=f626) [ 66.355136] Stack: c0294465 f6261ec0 05db f705a000 f6261f34 f6316380 f705a000 [ 66.355145]f6261dc4 f8adaf03 f6261ec0 05db f6b1b400 f6a1f480 00080e38 [ 66.355153]f6261ec0 ffea f8adc1e0 05db f6316380 f6261eac c030e1c5 05db [ 66.355161] Call Trace: [ 66.355163] [] show_trace_log_lvl+0x1a/0x30 [ 66.355171] [] show_stack_log_lvl+0xa9/0xd5 [ 66.355176] [] show_registers+0x1f9/0x362 [ 66.355180] [] die+0x12c/0x261 [ 66.355184] [] do_page_fault+0x2ef/0x5d0 [ 66.355188] [] error_code+0x7c/0x84 [ 66.355192] [] packet_sendmsg+0x147/0x201 [af_packet] [ 66.355199] [] sock_sendmsg+0xf9/0x116 [ 66.355204] [] sys_sendto+0xbf/0xe0 [ 66.355208] [] sys_socketcall+0x1aa/0x277 [ 66.355212] [] sysenter_past_esp+0x5f/0x99 [ 66.355216] === [ 66.355218] Code: Bad EIP value. [ 66.355223] EIP: [<>] 0x0 SS:ESP 0068:f6261d70 shaper_header() should check for shaper->dev not being NULL (ie. the shaper was actually attached) as in the following patch. This happens in mainline too (tested 2.6.19.2). Regards, Frederik Signed-off-by: Frederik Deweerdt <[EMAIL PROTECTED]> diff --git a/drivers/net/shaper.c b/drivers/net/shaper.c index e886e8d..40e9e27 100644 --- a/drivers/net/shaper.c +++ b/drivers/net/shaper.c @@ -340,6 +340,10 @@ static int shaper_header(struct sk_buff *skb, struct net_device *dev, { struct shaper *sh=dev->priv; int v; + + if(sh->dev==NULL) + return -ENODEV; + if(sh_debug) printk("Shaper header\n"); skb->dev=sh->dev; - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SATA exceptions with 2.6.20-rc5
On Saturday, 20. January 2007 20:59, you wrote: > Ian Kumlien wrote: > > Hi, > > > > I went from 2.6.19+sata_nv-adma-ncq-v7.patch, with no problems and adama > > enabled, to 2.6.20-rc5, which gave me problems almost instantly. > > > > I just thought that it might be interesting to know that it DID work > > nicely. > > > > CC since i'm not on the ml > > (I'm ccing more of the people who reported this) > > Well that's interesting.. The only significant change that went into > 2.6.20-rc5 in that driver that wasn't in that version you mentioned was > this one: > > http://www2.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=com >mit;h=2dec7555e6bf2772749113ea0ad454fcdb8cf861 > > Could you (or anyone else) test what happens if you take the 2.6.20-rc5 > version of sata_nv.c and try it on 2.6.19? That would tell us whether > it's this change or whether it's something else (i.e. in libata core). Ok, did that! (got a fresh 2.6.19 tar ball, and used 2.6.20-rc5' sata_nv.c with the oneliner in libata_sff.c) And surprise after one hour uptime, there is not even one sata exceptions in dmesg! (I'll report back tomorrow...) > > Assuming that still doesn't work, can you then try removing these lines > from nv_host_intr in 2.6.20-rc5 sata_nv.c and see what that does? > > /* bail out if not our interrupt */ > if (!(irq_stat & NV_INT_DEV)) > return 0; > > as that's the difference I'm most suspicious of causing the problem. Linux version 2.6.19test ([EMAIL PROTECTED]) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #2 SMP PREEMPT Sat Jan 20 22:19:20 CET 2007 Command line: root=/dev/md1 ro BIOS-provided physical RAM map: BIOS-e820: - 0009f800 (usable) BIOS-e820: 0009f800 - 000a (reserved) BIOS-e820: 000f - 0010 (reserved) BIOS-e820: 0010 - 7fff (usable) BIOS-e820: 7fff - 7fff3000 (ACPI NVS) BIOS-e820: 7fff3000 - 8000 (ACPI data) BIOS-e820: e000 - f000 (reserved) BIOS-e820: fec0 - 0001 (reserved) Entering add_active_range(0, 0, 159) 0 entries of 256 used Entering add_active_range(0, 256, 524272) 1 entries of 256 used end_pfn_map = 1048576 DMI 2.3 present. ACPI: RSDP (v000 Nvidia) @ 0x000f7d30 ACPI: RSDT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x) @ 0x7fff3040 ACPI: FADT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x) @ 0x7fff30c0 ACPI: SSDT (v001 PTLTD POWERNOW 0x0001 LTP 0x0001) @ 0x7fff9900 ACPI: SRAT (v001 AMDHAMMER 0x0001 AMD 0x0001) @ 0x7fff9b40 ACPI: MCFG (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x) @ 0x7fff9c40 ACPI: MADT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x) @ 0x7fff9840 ACPI: DSDT (v001 NVIDIA AWRDACPI 0x1000 MSFT 0x010e) @ 0x Entering add_active_range(0, 0, 159) 0 entries of 256 used Entering add_active_range(0, 256, 524272) 1 entries of 256 used Zone PFN ranges: DMA 0 -> 4096 DMA324096 -> 1048576 Normal1048576 -> 1048576 early_node_map[2] active PFN ranges 0:0 -> 159 0: 256 -> 524272 On node 0 totalpages: 524175 DMA zone: 56 pages used for memmap DMA zone: 10 pages reserved DMA zone: 3933 pages, LIFO batch:0 DMA32 zone: 7111 pages used for memmap DMA32 zone: 513065 pages, LIFO batch:31 Normal zone: 0 pages used for memmap Nvidia board detected. Ignoring ACPI timer override. If you got timer trouble try acpi_use_timer_override ACPI: PM-Timer IO Port: 0x4008 ACPI: Local APIC address 0xfee0 ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) Processor #0 (Bootup-CPU) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled) Processor #1 ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1]) ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1]) ACPI: IOAPIC (id[0x02] address[0xfec0] gsi_base[0]) IOAPIC[0]: apic_id 2, address 0xfec0, GSI 0-23 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) ACPI: BIOS IRQ0 pin2 override ignored. ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge) ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 15 high edge) ACPI: IRQ9 used by override. ACPI: IRQ14 used by override. ACPI: IRQ15 used by override. Setting APIC routing to physical flat Using ACPI (MADT) for SMP configuration information Nosave address range: 0009f000 - 000a Nosave address range: 000a - 000f Nosave address range: 000f - 0010 Allocating PCI resources starting at 8800 (gap: 8000:6000) SMP: Allowing 2 CPUs, 0 hotplug CPUs PERCPU: Allocating 32320 bytes of per cpu data Built 1 zonelists. Total pages: 516998 Kernel command line: root=/dev/md1 ro Initializing CPU#0 PID hash table
Re: SATA exceptions with 2.6.20-rc5
On lör, 2007-01-20 at 21:43 +, Alistair John Strachan wrote: > On Saturday 20 January 2007 19:59, Robert Hancock wrote: > > Ian Kumlien wrote: > > > Hi, > > > > > > I went from 2.6.19+sata_nv-adma-ncq-v7.patch, with no problems and adama > > > enabled, to 2.6.20-rc5, which gave me problems almost instantly. > > > > > > I just thought that it might be interesting to know that it DID work > > > nicely. > > > > > > CC since i'm not on the ml > > > > (I'm ccing more of the people who reported this) > > > > Well that's interesting.. The only significant change that went into > > 2.6.20-rc5 in that driver that wasn't in that version you mentioned was > > this one: > > > > http://www2.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=com > >mit;h=2dec7555e6bf2772749113ea0ad454fcdb8cf861 > > > > Could you (or anyone else) test what happens if you take the 2.6.20-rc5 > > version of sata_nv.c and try it on 2.6.19? That would tell us whether > > it's this change or whether it's something else (i.e. in libata core). > > I'm still running an -rc5 kernel with ADMA switched off entirely and I can't > reproduce the problem. How is everybody else reproducing this? > > I've been successful installing bonnie++, then going to a large XFS partition > and running "bonnie++ -u 1000:1000" and letting it run through, all defaults. > > It doesn't cause the problem I was seeing in -rc5 with ADMA on, when I switch > ADMA off, so I think this is sufficient to fix it. Eh? The whole point with that patch was to ADD ADMA support to sata_nv, imho that is something we want to have and i have been running with ADMA on on two computers since sata_nv-adma-ncq-v4 or 5 or so without problems. So, something has been introduced or been broken to cause this error, wouldn't it be better to find the error introduced than to just totally negate the patch in the first place? I haven't had the energy to go trough the patch that was found as causing the problem yet... I don't know if i even have all the info needed to make any form of educated guess but i'll give it a try when i have the energy. I really home someone finds it before then =) -- Ian Kumlien -- http://pomac.netswarm.net signature.asc Description: This is a digitally signed message part
Re: [-mm patch] fs/unionfs/: possible cleanups
On Thu, Jan 18, 2007 at 10:55:54PM +0100, Adrian Bunk wrote: > Let's start with a small exercise: > > Consider sparse tells you about a global function: > warning: symbol 'unionfs_d_revalidate_wrap' was not declared. Should > it be static? I ran sparse last week, and cleaned up a few things (not commited to korg yet). I'll use your patch instead. > This patch contains the following possible cleanups: > - every function should #include the headers containing the prototypes > of it's global functions > - static functions in C files shouldn't be marked "inline", gcc should > know best when to inline them > - make needlessly global code static > - #if 0 the following unused global function: > - stale_inode.c: is_stale_inode() > > Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]> Thanks. Josef "Jeff" Sipek. -- NT is to UNIX what a doughnut is to a particle accelerator. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
On Sun, 21 Jan 2007, Sunil Naidu wrote: > On 1/21/07, Tim Schmielau <[EMAIL PROTECTED]> wrote: > > > > Note that these dd "benchmarks" are completely bogus, because the data > > doesn't actually get written to disk in that time. For some enlightening > > data, try > > > > time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024; time sync > > > > The dd returns as soon as all data could be buffered in RAM. Only sync > > will show how long it takes to actually write out the data to disk. > > also explains why you see better results is writeout starts earlier. > > I am still getting better I feel: Yes. You have a faster Disk that writes about 45 MB/s. But I am not sure I understand what you want to know? > [EMAIL PROTECTED] ~]$ time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024; time > sync > 1024+0 records in > 1024+0 records out > 1073741824 bytes (1.1 GB) copied, 19.5007 seconds, 55.1 MB/s > > real0m20.439s > user0m0.004s > sys 0m4.535s > > real0m4.625s > user0m0.000s > sys 0m0.125s > > > [EMAIL PROTECTED] ~]$ time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 | sync > 1024+0 records in > 1024+0 records out > 1073741824 bytes (1.1 GB) copied, 20.8707 seconds, 51.4 MB/s > > real0m22.449s > user0m0.002s > sys 0m4.922s > > > Linux used here is not 2.6.20-rc5, but it's a FC6 2.6.19 binary. Shall > post the results with 2.6.20-rc5. > > BTW, does the results vary with a customized kernel (configured w.r.t > Processor & Hardware) than a generic kernel like FC6? I'd guess the kernel won't make much of a difference as the time is mostly determined by RAM and disk speeds. > Are there any other such test cases? Well, what do you want to find out? Anyways, I am in no way expert in the field of benchmarking. Note to Willy: I finally noticed my logic actually was not flawed. I stated why dd would report approximately doubled throughputs with buffering, while you argued why the total elapsed time would not change much. Time to go to bed now... Tim - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SATA exceptions with 2.6.20-rc5
On Saturday 20 January 2007 19:59, Robert Hancock wrote: > Ian Kumlien wrote: > > Hi, > > > > I went from 2.6.19+sata_nv-adma-ncq-v7.patch, with no problems and adama > > enabled, to 2.6.20-rc5, which gave me problems almost instantly. > > > > I just thought that it might be interesting to know that it DID work > > nicely. > > > > CC since i'm not on the ml > > (I'm ccing more of the people who reported this) > > Well that's interesting.. The only significant change that went into > 2.6.20-rc5 in that driver that wasn't in that version you mentioned was > this one: > > http://www2.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=com >mit;h=2dec7555e6bf2772749113ea0ad454fcdb8cf861 > > Could you (or anyone else) test what happens if you take the 2.6.20-rc5 > version of sata_nv.c and try it on 2.6.19? That would tell us whether > it's this change or whether it's something else (i.e. in libata core). I'm still running an -rc5 kernel with ADMA switched off entirely and I can't reproduce the problem. How is everybody else reproducing this? I've been successful installing bonnie++, then going to a large XFS partition and running "bonnie++ -u 1000:1000" and letting it run through, all defaults. It doesn't cause the problem I was seeing in -rc5 with ADMA on, when I switch ADMA off, so I think this is sufficient to fix it. Others have reported differently. Did you guys do: [EMAIL PROTECTED]:~$ cat /proc/cmdline root=/dev/sda1 ro sata_nv.adma=0 Or something similar? This is how Jeff suggested disabling ADMA and indeed the messages about its use disappear from dmesg. -- Cheers, Alistair. Final year Computer Science undergraduate. 1F2 55 South Clerk Street, Edinburgh, UK. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
On Sat, 20 Jan 2007, Willy Tarreau wrote: > On Sat, Jan 20, 2007 at 09:39:25PM +0100, Tim Schmielau wrote: > > On Sat, 20 Jan 2007, Willy Tarreau wrote: > > > On Sat, Jan 20, 2007 at 09:10:22PM +0100, Tim Schmielau wrote: > > > > > > > also explains why you see better results is writeout starts earlier. > > > > > > The results should be better when writeout starts later since most of > > > the transfer will have been performed at RAM speed. That's what happens > > > with the user above with 2 GB RAM. But in case of the VAIO with 512 MB, > > > there's really something strange IMHO. I suspect it has two RAM areas, > > > one fast and one slow (probably one two large non-cacheable area for a > > > shared video or such a crap, which can be avoided when reducing the > > > cache size). > > > > No - the earlier the writeout starts, the earlier he will have enough free > > RAM to finish the dd command by buffering the remaining data. > > OK I see your point. While trying to show why I got you wrong, I in fact > demonstrated to myself that you were right :-) > > For instance, let's say we have 500 MB cache at 1000 MB/s and a write out > threshold of 80% with a disk at 100 MB/s. Writing 1000 MB would produce > this pattern : > > time data sent writtendirty data > in secfrom ddto disk in cache >0.00 MB0 MB0 MB >0.4 400 MB0 MB 400 MB -> writeout starts >1.0 560 MB 60 MB 500 MB >5.4 1000 MB 500 MB 500 MB -> dd leaves. > 10.4 1000 MB 1000 MB0 MB -> write terminated. > > -> avg dd speed = 1000/5.4 = 185 MB/s >avg disk speed = 1000/10.4 = 96 MB/s > > > Now with a lower writeout threshold of 2% (10 MB) : > > time data sent writtendirty data > in secfrom ddto disk in cache >0.00 MB0 MB0 MB >0.01 10 MB0 MB 10 MB -> writeout starts >1.0 599 MB 99 MB 500 MB > 5.01 1000 MB 500 MB 500 MB -> dd leaves. > 10.01 1000 MB 1000 MB0 MB -> write terminated. > > -> avg dd speed = 1000/5.01 = 199.6 MB/s >avg disk speed = 1000/10.01 = 99.9 MB/s > > At least, numbers are not that much different to justify a one to two speed > ratio on the VAIO. The difference being caused by cache speed, it's clearly > possible that his RAM is definitely very very slow which would then explain > the difference. > > > > Note that we did not cap the amount of buffers, just started to write out > > earlier. > > > Indeed, that's what makes the whole difference. I was used to cap the amount > of buffers, but the behaviour here is different. > > Thanks for your insight ! Thanks for being so humble in pointing out my logic is flawed. While the Vaio certainly cannot write 1000GB/s to its RAM, it's disk is also quite slow and the ratio of 10:1 for RAM:disk speed is presumably correct. So we don't quite understand why dd in RAM is so slow for him. Thanks, Tim - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
On Sat, Jan 20, 2007 at 09:39:25PM +0100, Tim Schmielau wrote: > On Sat, 20 Jan 2007, Willy Tarreau wrote: > > On Sat, Jan 20, 2007 at 09:10:22PM +0100, Tim Schmielau wrote: > > > > > > Note that these dd "benchmarks" are completely bogus, because the data > > > doesn't actually get written to disk in that time. For some enlightening > > > data, try > > > > > > time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024; time sync > > > > > > The dd returns as soon as all data could be buffered in RAM. Only sync > > > will show how long it takes to actually write out the data to disk. > > > > While I 100% agree with you on this, I'd like to note that I don't agree > > with the following statement : > > > > > also explains why you see better results is writeout starts earlier. > > > > The results should be better when writeout starts later since most of > > the transfer will have been performed at RAM speed. That's what happens > > with the user above with 2 GB RAM. But in case of the VAIO with 512 MB, > > there's really something strange IMHO. I suspect it has two RAM areas, > > one fast and one slow (probably one two large non-cacheable area for a > > shared video or such a crap, which can be avoided when reducing the > > cache size). > > No - the earlier the writeout starts, the earlier he will have enough free > RAM to finish the dd command by buffering the remaining data. OK I see your point. While trying to show why I got you wrong, I in fact demonstrated to myself that you were right :-) For instance, let's say we have 500 MB cache at 1000 MB/s and a write out threshold of 80% with a disk at 100 MB/s. Writing 1000 MB would produce this pattern : time data sent writtendirty data in secfrom ddto disk in cache 0.00 MB0 MB0 MB 0.4 400 MB0 MB 400 MB -> writeout starts 1.0 560 MB 60 MB 500 MB 5.4 1000 MB 500 MB 500 MB -> dd leaves. 10.4 1000 MB 1000 MB0 MB -> write terminated. -> avg dd speed = 1000/5.4 = 185 MB/s avg disk speed = 1000/10.4 = 96 MB/s Now with a lower writeout threshold of 2% (10 MB) : time data sent writtendirty data in secfrom ddto disk in cache 0.00 MB0 MB0 MB 0.01 10 MB0 MB 10 MB -> writeout starts 1.0 599 MB 99 MB 500 MB 5.01 1000 MB 500 MB 500 MB -> dd leaves. 10.01 1000 MB 1000 MB0 MB -> write terminated. -> avg dd speed = 1000/5.01 = 199.6 MB/s avg disk speed = 1000/10.01 = 99.9 MB/s At least, numbers are not that much different to justify a one to two speed ratio on the VAIO. The difference being caused by cache speed, it's clearly possible that his RAM is definitely very very slow which would then explain the difference. > Note that we did not cap the amount of buffers, just started to write out > earlier. Indeed, that's what makes the whole difference. I was used to cap the amount of buffers, but the behaviour here is different. Thanks for your insight ! Willy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible
On Sat, 20 Jan 2007, Justin Piszcz wrote: > > > On Sat, 20 Jan 2007, Avuton Olrich wrote: > > > On 1/20/07, Justin Piszcz <[EMAIL PROTECTED]> wrote: > > > Perhaps its time to back to a stable (2.6.17.13 kernel)? > > > > > > Anyway, when I run a cp 18gb_file 18gb_file.2 on a dual raptor sw raid1 > > > partition, the OOM killer goes into effect and kills almost all my > > > processes. > > > > > > Completely 100% reproducible. > > > > > > Does 2.6.19.2 have some of memory allocation bug as well? > > > > I had been seeing something similar (also with 2.6.19.2), but it's not > > outputting anything to dmesg, so I was waiting for something to happen > > before I reported it. It's mostly the same thing, but I've only seen > > it happen when copying something large (2+ GB) over NFS. Interactivity > > completely goes away and lockups last 10-15 seconds a piece. Then > > realized I turned the swap off, so I turned it on and didn't lockup > > any longer. > > -- > > avuton > > -- > > Anyone who quotes me in their sig is an idiot. -- Rusty Russell. > > > > > > My swap is on, 2GB ram and 2GB of swap on this machine. I can't go back > to 2.6.17.13 as it does not recognize the NICs in my machine correctly and > the Alsa Intel HD Audio driver has bugs etc, I guess I am stuck with > 2.6.19.2 :( > > Justin. > > The weird part is nothing shows high memory usage in top or via ps, the kernel just freaks and kill -9's almost all of my processes. Nasty VM bug? Justin. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
On 1/21/07, Tim Schmielau <[EMAIL PROTECTED]> wrote: Note that these dd "benchmarks" are completely bogus, because the data doesn't actually get written to disk in that time. For some enlightening data, try time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024; time sync The dd returns as soon as all data could be buffered in RAM. Only sync will show how long it takes to actually write out the data to disk. also explains why you see better results is writeout starts earlier. I am still getting better I feel: [EMAIL PROTECTED] ~]$ time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024; time sync 1024+0 records in 1024+0 records out 1073741824 bytes (1.1 GB) copied, 19.5007 seconds, 55.1 MB/s real0m20.439s user0m0.004s sys 0m4.535s real0m4.625s user0m0.000s sys 0m0.125s [EMAIL PROTECTED] ~]$ time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 | sync 1024+0 records in 1024+0 records out 1073741824 bytes (1.1 GB) copied, 20.8707 seconds, 51.4 MB/s real0m22.449s user0m0.002s sys 0m4.922s Linux used here is not 2.6.20-rc5, but it's a FC6 2.6.19 binary. Shall post the results with 2.6.20-rc5. BTW, does the results vary with a customized kernel (configured w.r.t Processor & Hardware) than a generic kernel like FC6? Are there any other such test cases? Tim Thanks, ~Akula2 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible
On Sat, 20 Jan 2007, Avuton Olrich wrote: > On 1/20/07, Justin Piszcz <[EMAIL PROTECTED]> wrote: > > Perhaps its time to back to a stable (2.6.17.13 kernel)? > > > > Anyway, when I run a cp 18gb_file 18gb_file.2 on a dual raptor sw raid1 > > partition, the OOM killer goes into effect and kills almost all my > > processes. > > > > Completely 100% reproducible. > > > > Does 2.6.19.2 have some of memory allocation bug as well? > > I had been seeing something similar (also with 2.6.19.2), but it's not > outputting anything to dmesg, so I was waiting for something to happen > before I reported it. It's mostly the same thing, but I've only seen > it happen when copying something large (2+ GB) over NFS. Interactivity > completely goes away and lockups last 10-15 seconds a piece. Then > realized I turned the swap off, so I turned it on and didn't lockup > any longer. > -- > avuton > -- > Anyone who quotes me in their sig is an idiot. -- Rusty Russell. > > My swap is on, 2GB ram and 2GB of swap on this machine. I can't go back to 2.6.17.13 as it does not recognize the NICs in my machine correctly and the Alsa Intel HD Audio driver has bugs etc, I guess I am stuck with 2.6.19.2 :( Justin. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 11/15] ide: make ide_hwif_t.ide_dma_{host_off,off_quietly} void
Hello again. :-) Bartlomiej Zolnierkiewicz wrote: [PATCH] ide: make ide_hwif_t.ide_dma_{host_off,off_quietly} void Below are my nits on the patch itself, and the code it changes. Index: b/drivers/ide/pci/atiixp.c === --- a/drivers/ide/pci/atiixp.c +++ b/drivers/ide/pci/atiixp.c @@ -121,7 +121,7 @@ static int atiixp_ide_dma_host_on(ide_dr return __ide_dma_host_on(drive); } -static int atiixp_ide_dma_host_off(ide_drive_t *drive) +static void atiixp_ide_dma_host_off(ide_drive_t *drive) { struct pci_dev *dev = drive->hwif->pci_dev; unsigned long flags; [...] @@ -306,7 +306,7 @@ static void __devinit init_hwif_atiixp(i hwif->udma_four = 0; hwif->ide_dma_host_on = _ide_dma_host_on; - hwif->ide_dma_host_off = _ide_dma_host_off; + hwif->dma_host_off = _ide_dma_host_off; hwif->ide_dma_check = _dma_check; if (!noautodma) hwif->autodma = 1; Would seem logical to get rid of ide_ in the function's name also... Index: b/drivers/ide/pci/sgiioc4.c === --- a/drivers/ide/pci/sgiioc4.c +++ b/drivers/ide/pci/sgiioc4.c @@ -282,12 +282,11 @@ sgiioc4_ide_dma_on(ide_drive_t * drive) return HWIF(drive)->ide_dma_host_on(drive); } -static int -sgiioc4_ide_dma_off_quietly(ide_drive_t * drive) +static void sgiioc4_ide_dma_off_quietly(ide_drive_t *drive) { drive->using_dma = 0; - return HWIF(drive)->ide_dma_host_off(drive); + drive->hwif->dma_host_off(drive); } static int sgiioc4_ide_dma_check(ide_drive_t *drive) @@ -317,12 +316,9 @@ sgiioc4_ide_dma_host_on(ide_drive_t * dr return 1; } -static int -sgiioc4_ide_dma_host_off(ide_drive_t * drive) +static void sgiioc4_ide_dma_host_off(ide_drive_t * drive) { sgiioc4_clearirq(drive); - - return 0; } static int @@ -612,10 +608,10 @@ ide_init_sgiioc4(ide_hwif_t * hwif) hwif->ide_dma_end = _ide_dma_end; hwif->ide_dma_check = _ide_dma_check; hwif->ide_dma_on = _ide_dma_on; - hwif->ide_dma_off_quietly = _ide_dma_off_quietly; + hwif->dma_off_quietly = _ide_dma_off_quietly; hwif->ide_dma_test_irq = _ide_dma_test_irq; hwif->ide_dma_host_on = _ide_dma_host_on; - hwif->ide_dma_host_off = _ide_dma_host_off; + hwif->dma_host_off = _ide_dma_host_off; hwif->ide_dma_lostirq = _ide_dma_lostirq; hwif->ide_dma_timeout = &__ide_dma_timeout; The same here... Index: b/drivers/ide/pci/sl82c105.c === --- a/drivers/ide/pci/sl82c105.c +++ b/drivers/ide/pci/sl82c105.c @@ -261,26 +261,24 @@ static int sl82c105_ide_dma_on (ide_driv if (config_for_dma(drive)) { config_for_pio(drive, 4, 0, 0); Ugh, this forces PIO4 on fallback... and dma_on() doesn't select any modes in any other driver but this one. :-/ - return HWIF(drive)->ide_dma_off_quietly(drive); + drive->hwif->dma_off_quietly(drive); + return 0; } printk(KERN_INFO "%s: DMA enabled\n", drive->name); return __ide_dma_on(drive); } -static int sl82c105_ide_dma_off_quietly (ide_drive_t *drive) +static void sl82c105_ide_dma_off_quietly(ide_drive_t *drive) Also worth renaming... { u8 speed = XFER_PIO_0; - int rc; - + DBG(("sl82c105_ide_dma_off_quietly(drive:%s)\n", drive->name)); - rc = __ide_dma_off_quietly(drive); + ide_dma_off_quietly(drive); if (drive->pio_speed) Should always be non-zero since explicitly initialized. speed = drive->pio_speed - XFER_PIO_0; config_for_pio(drive, speed, 0, 1); drive->current_speed = drive->pio_speed; dma_off() shouldn't be changing current_speed IMHO. - - return rc; } The patch to fix those two functions is also cooking... MBR, Sergei - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: O_DIRECT question
Denis Vlasenko wrote: > On Thursday 11 January 2007 18:13, Michael Tokarev wrote: >> example, which isn't quite possible now from userspace. But as long as >> O_DIRECT actually writes data before returning from write() call (as it >> seems to be the case at least with a normal filesystem on a real block >> device - I don't touch corner cases like nfs here), it's pretty much >> THE ideal solution, at least from the application (developer) standpoint. > > Why do you want to wait while 100 megs of data are being written? > You _have to_ have threaded db code in order to not waste > gobs of CPU time on UP + even with that you eat context switch > penalty anyway. Usually it's done using aio ;) It's not that simple really. For reads, you have to wait for the data anyway before doing something with it. Omiting reads for now. For writes, it's not that problematic - even 10-15 threads is nothing compared with the I/O (O in this case) itself -- that context switch penalty. > I hope you agree that threaded code is not ideal performance-wise > - async IO is better. O_DIRECT is strictly sync IO. Hmm.. Now I'm confused. For example, oracle uses aio + O_DIRECT. It seems to be working... ;) As an alternative, there are multiple single-threaded db_writer processes. Why do you say O_DIRECT is strictly sync? In either case - I provided some real numbers in this thread before. Yes, O_DIRECT has its problems, even security problems. But the thing is - it is working, and working WAY better - from the performance point of view - than "indirect" I/O, and currently there's no alternative that works as good as O_DIRECT. Thanks. /mjt - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 12/15] ide: make ide_hwif_t.ide_dma_host_on void
Hello again. :-) Bartlomiej Zolnierkiewicz wrote: [PATCH] ide: make ide_hwif_t.ide_dma_host_on void * since ide_hwif_t.ide_dma_host_on is called either when drive->using_dma == 1 or when return value is discarded make it void, also drop "ide_" prefix * make __ide_dma_host_on() void and drop "__" prefix Below are some nits which also apply to the previous patch... Index: b/drivers/ide/pci/atiixp.c === --- a/drivers/ide/pci/atiixp.c +++ b/drivers/ide/pci/atiixp.c @@ -101,7 +101,7 @@ static u8 atiixp_dma_2_pio(u8 xfer_rate) } } -static int atiixp_ide_dma_host_on(ide_drive_t *drive) +static void atiixp_ide_dma_host_on(ide_drive_t *drive) { Would seem logical to get rid of ide_ in this function's name also... struct pci_dev *dev = drive->hwif->pci_dev; unsigned long flags; [...] Index: b/drivers/ide/pci/sgiioc4.c === --- a/drivers/ide/pci/sgiioc4.c +++ b/drivers/ide/pci/sgiioc4.c [...] @@ -307,13 +307,8 @@ sgiioc4_ide_dma_test_irq(ide_drive_t * d return sgiioc4_checkirq(HWIF(drive)); } -static int -sgiioc4_ide_dma_host_on(ide_drive_t * drive) +static void sgiioc4_ide_dma_host_on(ide_drive_t * drive) Same comment here... { - if (drive->using_dma) - return 0; - - return 1; } static void sgiioc4_ide_dma_host_off(ide_drive_t * drive) @@ -610,7 +605,7 @@ ide_init_sgiioc4(ide_hwif_t * hwif) hwif->ide_dma_on = _ide_dma_on; hwif->dma_off_quietly = _ide_dma_off_quietly; hwif->ide_dma_test_irq = _ide_dma_test_irq; - hwif->ide_dma_host_on = _ide_dma_host_on; + hwif->dma_host_on = _ide_dma_host_on; hwif->dma_host_off = _ide_dma_host_off; hwif->ide_dma_lostirq = _ide_dma_lostirq; hwif->ide_dma_timeout = &__ide_dma_timeout; Unrelated note: not sure why this default value needs explicit assignemnt... MBR, Sergei - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible
On 1/20/07, Justin Piszcz <[EMAIL PROTECTED]> wrote: Perhaps its time to back to a stable (2.6.17.13 kernel)? Anyway, when I run a cp 18gb_file 18gb_file.2 on a dual raptor sw raid1 partition, the OOM killer goes into effect and kills almost all my processes. Completely 100% reproducible. Does 2.6.19.2 have some of memory allocation bug as well? I had been seeing something similar (also with 2.6.19.2), but it's not outputting anything to dmesg, so I was waiting for something to happen before I reported it. It's mostly the same thing, but I've only seen it happen when copying something large (2+ GB) over NFS. Interactivity completely goes away and lockups last 10-15 seconds a piece. Then realized I turned the swap off, so I turned it on and didn't lockup any longer. -- avuton -- Anyone who quotes me in their sig is an idiot. -- Rusty Russell. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH -rt] whitespace cleanup for 2.6.20-rc5-rt7
fixes trailing whitespace and spaces before tab indents in 2.6.20-rc5-rt7 as reported with: git-apply --whitespace=error-all --- arch/arm/mach-omap1/time.c|2 +- arch/i386/kernel/entry.S |2 +- arch/i386/kernel/reboot.c |2 +- arch/ia64/Kconfig |4 ++-- arch/ia64/kernel/fsys.S |2 +- arch/ia64/kernel/time.c |4 ++-- arch/x86_64/kernel/entry.S|2 +- arch/x86_64/kernel/hpet.c |2 +- arch/x86_64/kernel/reboot.c |2 +- arch/x86_64/kernel/smpboot.c |8 arch/x86_64/kernel/tsc.c |8 arch/x86_64/kernel/vsyscall.c |2 +- drivers/char/lpptest.c|2 +- include/asm-arm/atomic.h |2 +- include/asm-generic/vmlinux.lds.h |4 ++-- include/linux/irq.h |6 +++--- include/linux/mutex.h |2 +- kernel/fork.c |6 +++--- kernel/futex.c|4 ++-- kernel/hrtimer.c |2 +- kernel/latency_trace.c|6 +++--- kernel/spinlock.c |2 +- kernel/workqueue.c|2 +- mm/migrate.c |2 +- mm/slab.c | 14 +++--- 25 files changed, 47 insertions(+), 47 deletions(-) diff --git a/arch/arm/mach-omap1/time.c b/arch/arm/mach-omap1/time.c index 7baf0df..85f9f5e 100644 --- a/arch/arm/mach-omap1/time.c +++ b/arch/arm/mach-omap1/time.c @@ -105,7 +105,7 @@ static inline unsigned long omap_mpu_tim static inline void omap_mpu_set_autoreset(int nr) { - volatile omap_mpu_timer_regs_t* timer = omap_mpu_timer_base(nr); + volatile omap_mpu_timer_regs_t* timer = omap_mpu_timer_base(nr); timer->cntl = timer->cntl | MPU_TIMER_AR; } diff --git a/arch/i386/kernel/entry.S b/arch/i386/kernel/entry.S index ce8a092..33d6b80 100644 --- a/arch/i386/kernel/entry.S +++ b/arch/i386/kernel/entry.S @@ -272,7 +272,7 @@ need_resched: jz restore_nocheck testl $IF_MASK,PT_EFLAGS(%esp) # interrupts off (exception path) ? jz restore_nocheck - DISABLE_INTERRUPTS(CLBR_ANY) + DISABLE_INTERRUPTS(CLBR_ANY) call preempt_schedule_irq jmp need_resched diff --git a/arch/i386/kernel/reboot.c b/arch/i386/kernel/reboot.c index 5f81e80..a3e7410 100644 --- a/arch/i386/kernel/reboot.c +++ b/arch/i386/kernel/reboot.c @@ -327,7 +327,7 @@ void machine_emergency_restart(void) asm volatile ( "1: .byte 0x0f, 0x01, 0xc4 \n" /* vmxoff */ "2: \n" - ".section __ex_table,\"a\" \n" + ".section __ex_table,\"a\" \n" " .align 4\n" " .long 1b,2b \n" ".previous \n" diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig index 333049c..a1a6bc3 100644 --- a/arch/ia64/Kconfig +++ b/arch/ia64/Kconfig @@ -300,7 +300,7 @@ config HIGH_RES_RESOLUTION choice prompt "Clock source" depends on HIGH_RES_TIMERS - default HIGH_RES_TIMER_ITC + default HIGH_RES_TIMER_ITC help This option allows you to choose the hardware source in charge of generating high precision interruptions on your system. @@ -308,7 +308,7 @@ choice ITC Interval Time Counter 1/CPU clock - HPET High Precision Event Timer ~ (XXX:have to check the spec) + HPET High Precision Event Timer ~ (XXX:have to check the spec) The ITC timer is available on all the ia64 computers because it is integrated directly into the processor. However it may not diff --git a/arch/ia64/kernel/fsys.S b/arch/ia64/kernel/fsys.S index 7b94c34..2abf920 100644 --- a/arch/ia64/kernel/fsys.S +++ b/arch/ia64/kernel/fsys.S @@ -219,7 +219,7 @@ ENTRY(fsys_gettimeofday) (p6)br.cond.spnt.few .fail_einval // deferred branch ;; ld8 r30 = [r10] // clocksource->mmio_ptr - movl r19 = itc_lastcycle + movl r19 = itc_lastcycle add r23 = IA64_CLOCKSOURCE_SHIFT_OFFSET,r20 cmp.ne p6, p0 = 0, r2 // Fallback if work is scheduled (p6)br.cond.spnt.many fsys_fallback_syscall diff --git a/arch/ia64/kernel/time.c b/arch/ia64/kernel/time.c index 983ef26..4fc3670 100644 --- a/arch/ia64/kernel/time.c +++ b/arch/ia64/kernel/time.c @@ -262,10 +262,10 @@ ia64_init_itm (void) ia64_cpu_local_tick(); if (!clocksource_itc_p) { - /* Sort out mult/shift values: */ + /* Sort out mult/shift values: */ clocksource_itc.mult = clocksource_hz2mult(local_cpu_data->itc_freq, clocksource_itc.shift); -
Re: Abysmal disk performance, how to debug?
On Sat, 20 Jan 2007, Willy Tarreau wrote: > On Sat, Jan 20, 2007 at 09:10:22PM +0100, Tim Schmielau wrote: > > > > Note that these dd "benchmarks" are completely bogus, because the data > > doesn't actually get written to disk in that time. For some enlightening > > data, try > > > > time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024; time sync > > > > The dd returns as soon as all data could be buffered in RAM. Only sync > > will show how long it takes to actually write out the data to disk. > > While I 100% agree with you on this, I'd like to note that I don't agree > with the following statement : > > > also explains why you see better results is writeout starts earlier. > > The results should be better when writeout starts later since most of > the transfer will have been performed at RAM speed. That's what happens > with the user above with 2 GB RAM. But in case of the VAIO with 512 MB, > there's really something strange IMHO. I suspect it has two RAM areas, > one fast and one slow (probably one two large non-cacheable area for a > shared video or such a crap, which can be avoided when reducing the > cache size). No - the earlier the writeout starts, the earlier he will have enough free RAM to finish the dd command by buffering the remaining data. Note that we did not cap the amount of buffers, just started to write out earlier. Tim - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
On Sat, Jan 20, 2007 at 09:28:57PM +0100, Tim Schmielau wrote: > On Sat, 20 Jan 2007, Willy Tarreau wrote: > > > Anyway, in your situation with a very small buffer, this should not > > change by more than half a second or so. > > Well, his buffer is not small. He has half a GB of RAM, so when > writing 1 GB the buffer would roughly double the dd speed, exactly as he > has shown us. yes, but he sees the opposite : when using half of the memory for the buffer, he has low speed. When shorting cache size to 2% only, he doubles his speed. > Anyways, instead of always just posting similar answers to yours, I'll > have dinner now. :-) :-) Willy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
On Sat, 20 Jan 2007, Willy Tarreau wrote: > Anyway, in your situation with a very small buffer, this should not > change by more than half a second or so. Well, his buffer is not small. He has half a GB of RAM, so when writing 1 GB the buffer would roughly double the dd speed, exactly as he has shown us. Anyways, instead of always just posting similar answers to yours, I'll have dinner now. :-) Tim - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
Hi Tim, On Sat, Jan 20, 2007 at 09:10:22PM +0100, Tim Schmielau wrote: > On Sat, 20 Jan 2007, Ismail Dönmez wrote: > > > 20 Oca 2007 Cts 19:45 tarihinde ??unlar?? yazmt??n??z: > > [...] > > > > vaio cartman # hdparm -tT /dev/hda > > > > > > > > /dev/hda: > > > > Timing cached reads: 1576 MB in 2.00 seconds = 788.18 MB/sec > > > > Timing buffered disk reads: 74 MB in 3.01 seconds = 24.55 MB/sec > > > > > > > > > > > > [~]> time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 > > > > 1024+0 records in > > > > 1024+0 records out > > > > 1073741824 bytes (1,1 GB) copied, 77,2809 s, 13,9 MB/s > > > > > > > > real1m17.482s > > > > user0m0.003s > > > > sys 0m2.350s > > > > > > That's not bad at all ! I suspect that if your system becomes > > > unresponsive, > > > it's because real writes start when the cache is full. And if you fill > > > 512 MB of RAM with data that you then need to flush on disk at 14 MB/s, it > > > can take about 40 seconds during which it might be difficult to do > > > anything. > > > > > > Try lowering the cache flush starting point to about 10 MB if you want > > > (2% of 512 MB) : > > > > > > # echo 2 >/proc/sys/vm/dirty_ratio > > > # echo 2 >/proc/sys/vm/dirty_background_ratio > > > > After that I get, > > > > [~]> time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 > > 1024+0 records in > > 1024+0 records out > > 1073741824 bytes (1,1 GB) copied, 41,7005 s, 25,7 MB/s > > > > real0m41.926s > > user0m0.007s > > sys 0m2.500s > > > > > > not bad! thanks :) > > Note that these dd "benchmarks" are completely bogus, because the data > doesn't actually get written to disk in that time. For some enlightening > data, try > > time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024; time sync > > The dd returns as soon as all data could be buffered in RAM. Only sync > will show how long it takes to actually write out the data to disk. While I 100% agree with you on this, I'd like to note that I don't agree with the following statement : > also explains why you see better results is writeout starts earlier. The results should be better when writeout starts later since most of the transfer will have been performed at RAM speed. That's what happens with the user above with 2 GB RAM. But in case of the VAIO with 512 MB, there's really something strange IMHO. I suspect it has two RAM areas, one fast and one slow (probably one two large non-cacheable area for a shared video or such a crap, which can be avoided when reducing the cache size). Best regards, Willy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 10/15] ide: add ide_set_dma() helper
Hello again. :-) Bartlomiej Zolnierkiewicz wrote: [PATCH] ide: add ide_set_dma() helper * add ide_set_dma() helper and make ide_hwif_t.ide_dma_check return -1 when DMA needs to be disabled (== need to call ->ide_dma_off_quietly) 0 when DMA needs to be enabled (== need to call ->ide_dma_on) 1 when DMA setting shouldn't be changed * fix IDE code to use ide_set_dma() instead if using ->ide_dma_check directly Here are a few comments related to the code being patched: Index: b/drivers/ide/pci/alim15x3.c === --- a/drivers/ide/pci/alim15x3.c +++ b/drivers/ide/pci/alim15x3.c @@ -507,17 +507,15 @@ static int config_chipset_for_dma (ide_d * * Configure a drive for DMA operation. If DMA is not possible we * drop the drive into PIO mode instead. - * - * FIXME: exactly what are we trying to return here */ - + static int ali15x3_config_drive_for_dma(ide_drive_t *drive) { ide_hwif_t *hwif= HWIF(drive); struct hd_driveid *id = drive->id; if ((m5229_revision<=0x20) && (drive->media!=ide_disk)) - return hwif->ide_dma_off_quietly(drive); + goto no_dma_set; Isn't it better to just return -1? @@ -552,9 +550,10 @@ try_dma_modes: ata_pio: hwif->tuneproc(drive, 255); no_dma_set: - return hwif->ide_dma_off_quietly(drive); + return -1; } - return hwif->ide_dma_on(drive); + + return 0; } Ugh, this code looks like it's asking to be converted into calling ide_use_dma(). instead all of that... Index: b/drivers/ide/pci/cs5520.c === --- a/drivers/ide/pci/cs5520.c +++ b/drivers/ide/pci/cs5520.c @@ -132,12 +132,11 @@ static void cs5520_tune_drive(ide_drive_ static int cs5520_config_drive_xfer_rate(ide_drive_t *drive) { - ide_hwif_t *hwif = HWIF(drive); - /* Tune the drive for PIO modes up to PIO 4 */ cs5520_tune_drive(drive, 4); Ugh. Why not ask drive? :-/ /* Then tell the core to use DMA operations */ - return hwif->ide_dma_on(drive); + return 0; That must be the famous VDMA thing... :-) I wonder if it *actually* works on HPT36x/37x which seem to have support for it... Index: b/drivers/ide/pci/jmicron.c === --- a/drivers/ide/pci/jmicron.c +++ b/drivers/ide/pci/jmicron.c @@ -164,14 +164,12 @@ static int config_chipset_for_dma (ide_d static int jmicron_config_drive_for_dma (ide_drive_t *drive) { - ide_hwif_t *hwif= drive->hwif; + if (ide_use_dma(drive) && config_chipset_for_dma(drive)) + return 0; - if (ide_use_dma(drive)) { - if (config_chipset_for_dma(drive)) - return hwif->ide_dma_on(drive); - } config_jmicron_chipset_for_pio(drive, 1); The 2nd argument of that funtion is useless -- it basically does nothing if 0 is passed. Another case of mindless copy-paste. :-) Index: b/drivers/ide/pci/pdc202xx_old.c === --- a/drivers/ide/pci/pdc202xx_old.c +++ b/drivers/ide/pci/pdc202xx_old.c @@ -332,17 +332,15 @@ chipset_is_set: static int pdc202xx_config_drive_xfer_rate (ide_drive_t *drive) { - ide_hwif_t *hwif= HWIF(drive); - drive->init_speed = 0; if (ide_use_dma(drive) && config_chipset_for_dma(drive)) - return hwif->ide_dma_on(drive); + return 0; if (ide_use_fast_pio(drive)) - hwif->tuneproc(drive, 5); + config_chipset_for_pio(drive, 5); That part is obsoleted by my recent fix... Index: b/drivers/ide/pci/piix.c === --- a/drivers/ide/pci/piix.c +++ b/drivers/ide/pci/piix.c @@ -386,20 +386,18 @@ static int piix_config_drive_for_dma (id static int piix_config_drive_xfer_rate (ide_drive_t *drive) { - ide_hwif_t *hwif= HWIF(drive); - drive->init_speed = 0; if (ide_use_dma(drive) && piix_config_drive_for_dma(drive)) - return hwif->ide_dma_on(drive); + return 0; if (ide_use_fast_pio(drive)) { /* Find best PIO mode. */ - (void) hwif->speedproc(drive, XFER_PIO_0 + - ide_get_best_pio_mode(drive, 255, 4, NULL)); + u8 pio = ide_get_best_pio_mode(drive, 255, 4, NULL); + (void)piix_tune_chipset(drive, XFER_PIO_0 + pio); } Will try to fix the tuneproc() nuisance RSN. :-) Index: b/drivers/ide/pci/serverworks.c === --- a/drivers/ide/pci/serverworks.c +++ b/drivers/ide/pci/serverworks.c @@ -315,17 +315,15 @@ static int config_chipset_for_dma (ide_d static int
Re: Abysmal disk performance, how to debug?
On Sat, 20 Jan 2007, Ismail Dönmez wrote: > 20 Oca 2007 Cts 22:10 tarihinde, Tim Schmielau şunları yazmıştı: > > > > Note that these dd "benchmarks" are completely bogus, because the data=20 > > doesn't actually get written to disk in that time. For some enlightening=20 > > data, try > > > > time dd if=3D/dev/zero of=3D/tmp/1GB bs=3D1M count=3D1024; time sync > > > > The dd returns as soon as all data could be buffered in RAM. Only sync=20 > > will show how long it takes to actually write out the data to disk. > > also explains why you see better results is writeout starts earlier. > > Still not that bad: > > [~]> time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024;sync > 1024+0 records in > 1024+0 records out > 1073741824 bytes (1,1 GB) copied, 53,3194 s, 20,1 MB/s > > real0m53.517s > user0m0.003s > sys 0m3.193s > That's not the point, you still measured the same as before (but you might have noticed that, after printing the results, the shell prompt took some time to appear). I appended "time sync" to the command to show that (depending on the amount of available memory) actually most of the time is spent in the "sync", not the "dd". Tim
[2.6 patch] remove the broken FB_S3TRIO driver
The FB_S3TRIO driver: - has been marked as BROKEN for more than two years and - is still marked as BROKEN. Drivers that had been marked as BROKEN for such a long time seem to be unlikely to be revived in the forseeable future. But if anyone wants to ever revive this driver, the code is still present in the older kernel releases. Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]> Acked-by: Geert Uytterhoeven <[EMAIL PROTECTED]> --- This patch was already sent on: - 4 Jan 2007 drivers/video/Kconfig|6 drivers/video/Makefile |1 drivers/video/S3triofb.c | 790 --- include/video/s3blit.h | 79 --- 4 files changed, 876 deletions(-) --- linux-2.6.20-rc2-mm1/drivers/video/Kconfig.old 2007-01-04 19:37:20.0 +0100 +++ linux-2.6.20-rc2-mm1/drivers/video/Kconfig 2007-01-04 19:37:31.0 +0100 @@ -1113,12 +1113,6 @@ help Say Y here if you want to control the backlight of your display. -config FB_S3TRIO - bool "S3 Trio display support" - depends on (FB = y) && PPC && BROKEN - help - If you have a S3 Trio say Y. Say N for S3 Virge. - config FB_S3 tristate "S3 Trio/Virge support" depends on FB && PCI --- linux-2.6.20-rc2-mm1/drivers/video/Makefile.old 2007-01-04 19:37:46.0 +0100 +++ linux-2.6.20-rc2-mm1/drivers/video/Makefile 2007-01-04 19:37:50.0 +0100 @@ -48,7 +48,6 @@ obj-$(CONFIG_FB_VALKYRIE) += valkyriefb.o obj-$(CONFIG_FB_CT65550) += chipsfb.o obj-$(CONFIG_FB_IMSTT)+= imsttfb.o -obj-$(CONFIG_FB_S3TRIO) += S3triofb.o obj-$(CONFIG_FB_FM2) += fm2fb.o obj-$(CONFIG_FB_CYBLA)+= cyblafb.o obj-$(CONFIG_FB_TRIDENT) += tridentfb.o --- linux-2.6.20-rc2-mm1/include/video/s3blit.h 2006-11-29 22:57:37.0 +0100 +++ /dev/null 2006-09-19 00:45:31.0 +0200 @@ -1,79 +0,0 @@ -#ifndef _VIDEO_S3BLIT_H -#define _VIDEO_S3BLIT_H - -/* s3 commands */ -#define S3_BITBLT 0xc011 -#define S3_TWOPOINTLINE 0x2811 -#define S3_FILLEDRECT 0x40b1 - -#define S3_FIFO_EMPTY 0x0400 -#define S3_HDW_BUSY 0x0200 - -/* Enhanced register mapping (MMIO mode) */ - -#define S3_READ_SEL 0xbee8 /* offset f */ -#define S3_MULT_MISC 0xbee8 /* offset e */ -#define S3_ERR_TERM 0x92e8 -#define S3_FRGD_COLOR0xa6e8 -#define S3_BKGD_COLOR0xa2e8 -#define S3_PIXEL_CNTL0xbee8 /* offset a */ -#define S3_FRGD_MIX 0xbae8 -#define S3_BKGD_MIX 0xb6e8 -#define S3_CUR_Y 0x82e8 -#define S3_CUR_X 0x86e8 -#define S3_DESTY_AXSTP 0x8ae8 -#define S3_DESTX_DIASTP 0x8ee8 -#define S3_MIN_AXIS_PCNT 0xbee8 /* offset 0 */ -#define S3_MAJ_AXIS_PCNT 0x96e8 -#define S3_CMD 0x9ae8 -#define S3_GP_STAT 0x9ae8 -#define S3_ADVFUNC_CNTL 0x4ae8 -#define S3_WRT_MASK 0xaae8 -#define S3_RD_MASK 0xaee8 - -/* Enhanced register mapping (Packed MMIO mode, write only) */ -#define S3_ALT_CURXY 0x8100 -#define S3_ALT_CURXY20x8104 -#define S3_ALT_STEP 0x8108 -#define S3_ALT_STEP2 0x810c -#define S3_ALT_ERR 0x8110 -#define S3_ALT_CMD 0x8118 -#define S3_ALT_MIX 0x8134 -#define S3_ALT_PCNT 0x8148 -#define S3_ALT_PAT 0x8168 - -/* Drawing modes */ -#define S3_NOTCUR 0x -#define S3_LOGICALZERO 0x0001 -#define S3_LOGICALONE 0x0002 -#define S3_LEAVEASIS 0x0003 -#define S3_NOTNEW 0x0004 -#define S3_CURXORNEW 0x0005 -#define S3_NOT_CURXORNEW 0x0006 -#define S3_NEW 0x0007 -#define S3_NOTCURORNOTNEW 0x0008 -#define S3_CURORNOTNEW 0x0009 -#define S3_NOTCURORNEW 0x000a -#define S3_CURORNEW0x000b -#define S3_CURANDNEW 0x000c -#define S3_NOTCURANDNEW0x000d -#define S3_CURANDNOTNEW0x000e -#define S3_NOTCURANDNOTNEW 0x000f - -#define S3_CRTC_ADR0x03d4 -#define S3_CRTC_DATA 0x03d5 - -#define S3_REG_LOCK2 0x39 -#define S3_HGC_MODE 0x45 - -#define S3_HWGC_ORGX_H 0x46 -#define S3_HWGC_ORGX_L 0x47 -#define S3_HWGC_ORGY_H 0x48 -#define S3_HWGC_ORGY_L 0x49 -#define S3_HWGC_DX 0x4e -#define S3_HWGC_DY 0x4f - - -#define S3_LAW_CTL 0x58 - -#endif /* _VIDEO_S3BLIT_H */ --- linux-2.6.20-rc2-mm1/drivers/video/S3triofb.c 2006-12-28 11:57:35.0 +0100 +++ /dev/null 2006-09-19 00:45:31.0 +0200 @@ -1,790 +0,0 @@ -/* - * linux/drivers/video/S3Triofb.c -- Open Firmware based frame buffer device - * - * Copyright (C) 1997 Peter De Schrijver - * - * This driver is partly based on the PowerMac console driver: - * - * Copyright (C) 1996 Paul Mackerras - * - * and on the Open Firmware based frame buffer device: - * - * Copyright (C) 1997 Geert Uytterhoeven - * - * This file is subject to the terms and conditions of the GNU General Public - * License. See the file COPYING in the main directory of this archive for - * more details. - */ - -/* - Bugs : + OF dependencies should be removed. - + This
2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible
Perhaps its time to back to a stable (2.6.17.13 kernel)? Anyway, when I run a cp 18gb_file 18gb_file.2 on a dual raptor sw raid1 partition, the OOM killer goes into effect and kills almost all my processes. Completely 100% reproducible. Does 2.6.19.2 have some of memory allocation bug as well? System Events =-=-=-=-=-=-= Jan 20 15:13:13 p34 kernel: [ 8204.430917] top invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0 Jan 20 15:13:15 p34 kernel: [ 8204.430929] [] out_of_memory+0x189/0x1b6 Jan 20 15:13:15 p34 kernel: [ 8204.430939] [] __alloc_pages+0x27b/0x2db Jan 20 15:13:15 p34 kernel: [ 8204.430949] [] __get_free_pages+0x42/0x4d Jan 20 15:13:15 p34 kernel: [ 8204.430955] [] proc_file_read+0x72/0x28e Jan 20 15:13:15 p34 kernel: [ 8204.430962] [] _atomic_dec_and_lock+0x2d/0x54 Jan 20 15:13:15 p34 kernel: [ 8204.430968] [] mntput_no_expire+0x1c/0x7d Jan 20 15:13:15 p34 kernel: [ 8204.430975] [] vfs_read+0x9d/0x17b Jan 20 15:13:15 p34 kernel: [ 8204.430983] [] sys_read+0x4b/0x74 Jan 20 15:13:15 p34 kernel: [ 8204.430988] [] syscall_call+0x7/0xb Jan 20 15:13:15 p34 kernel: [ 8204.430994] === Jan 20 15:13:15 p34 kernel: [ 8204.430997] Mem-info: Jan 20 15:13:15 p34 kernel: [ 8204.430999] DMA per-cpu: Jan 20 15:13:15 p34 squid[2515]: Squid Parent: child process 2517 exited due to signal 9 Jan 20 15:13:15 p34 kernel: [ 8204.431003] CPU0: Hot: hi:0, btch: 1 usd: 0 Cold: hi:0, btch: 1 usd: 0 Jan 20 15:13:15 p34 kernel: [ 8204.431007] CPU1: Hot: hi:0, btch: 1 usd: 0 Cold: hi:0, btch: 1 usd: 0 Jan 20 15:13:15 p34 kernel: [ 8204.431010] Normal per-cpu: Jan 20 15:13:15 p34 kernel: [ 8204.431013] CPU0: Hot: hi: 186, btch: 31 usd: 166 Cold: hi: 62, btch: 15 usd: 50 Jan 20 15:13:15 p34 kernel: [ 8204.431017] CPU1: Hot: hi: 186, btch: 31 usd: 30 Cold: hi: 62, btch: 15 usd: 60 Jan 20 15:13:15 p34 kernel: [ 8204.431021] HighMem per-cpu: Jan 20 15:13:15 p34 kernel: [ 8204.431024] CPU0: Hot: hi: 186, btch: 31 usd: 13 Cold: hi: 62, btch: 15 usd: 3 Jan 20 15:13:15 p34 kernel: [ 8204.431028] CPU1: Hot: hi: 186, btch: 31 usd: 23 Cold: hi: 62, btch: 15 usd: 14 Jan 20 15:13:15 p34 kernel: [ 8204.431033] Active:8356 inactive:247241 dirty:58126 writeback:146669 unstable:0 free:41510 slab:61581 mapped:5339 pagetables:470 Jan 20 15:13:15 p34 kernel: [ 8204.431038] DMA free:3544kB min:68kB low:84kB high:100kB active:0kB inactive:0kB present:16256kB pages_scanned:0 all_unreclaimable? yes Jan 20 15:13:15 p34 kernel: [ 8204.431055] lowmem_reserve[]: 0 873 1991 Jan 20 15:13:15 p34 kernel: [ 8204.431064] Normal free:2788kB min:3744kB low:4680kB high:5616kB active:4kB inactive:30496kB present:894080kB pages_scanned:47944 all_unreclaimable? yes Jan 20 15:13:15 p34 kernel: [ 8204.431068] lowmem_reserve[]: 0 0 8945 Jan 20 15:13:15 p34 kernel: [ 8204.431077] HighMem free:159708kB min:512kB low:1708kB high:2908kB active:33420kB inactive:958468kB present:1145032kB pages_scanned:224 all_unreclaimable? no Jan 20 15:13:15 p34 kernel: [ 8204.431080] lowmem_reserve[]: 0 0 0 Jan 20 15:13:15 p34 kernel: [ 8204.431097] DMA: 0*4kB 1*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3544kB Jan 20 15:13:15 p34 kernel: [ 8204.431114] Normal: 1*4kB 0*8kB 0*16kB 3*32kB 0*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 2788kB Jan 20 15:13:15 p34 kernel: [ 8204.431132] HighMem: 1*4kB 1*8kB 3997*16kB 2194*32kB 353*64kB 17*128kB 3*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 159708kB Jan 20 15:13:15 p34 kernel: [ 8204.431150] Swap cache: add 2551, delete 2533, find 11/16, race 0+0 Jan 20 15:13:15 p34 kernel: [ 8204.431153] Free swap = 2190716kB Jan 20 15:13:15 p34 kernel: [ 8204.431155] Total swap = 2200760kB Jan 20 15:13:15 p34 kernel: [ 8204.431157] Free swap: 2190716kB Jan 20 15:13:15 p34 kernel: [ 8204.436274] 517888 pages of RAM Jan 20 15:13:15 p34 kernel: [ 8204.436279] 288512 pages of HIGHMEM Jan 20 15:13:15 p34 kernel: [ 8204.436281] 5662 reserved pages Jan 20 15:13:15 p34 kernel: [ 8204.436283] 246964 pages shared Jan 20 15:13:15 p34 kernel: [ 8204.436285] 18 pages swap cached Jan 20 15:13:15 p34 kernel: [ 8204.436288] 58126 pages dirty Jan 20 15:13:15 p34 kernel: [ 8204.436290] 146669 pages writeback Jan 20 15:13:15 p34 kernel: [ 8204.436292] 5339 pages mapped Jan 20 15:13:15 p34 kernel: [ 8204.436294] 61581 pages slab Jan 20 15:13:15 p34 kernel: [ 8204.436296] 470 pages pagetables Jan 20 15:13:15 p34 kernel: [ 8204.436391] Out of Memory: Kill process 1848 (named) score 12591 and children. Jan 20 15:13:15 p34 kernel: [ 8204.436395] Out of memory: Killed process 1848 (named). Jan 20 15:13:15 p34 kernel: [ 8204.437143] bash invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0 Jan 20 15:13:15 p34 kernel: [ 8204.437151] [] out_of_memory+0x189/0x1b6 Jan 20 15:13:15 p34 kernel: [ 8204.437160] [] __alloc_pages+0x27b/0x2db Jan 20 15:13:15 p34
Re: Abysmal disk performance, how to debug?
On Sat, Jan 20, 2007 at 10:16:15PM +0200, Ismail Dönmez wrote: > 20 Oca 2007 Cts 22:10 tarihinde, Tim Schmielau ??unlar?? yazmt??: > [...] > > > > Note that these dd "benchmarks" are completely bogus, because the data=20 > > doesn't actually get written to disk in that time. For some enlightening=20 > > data, try > > > > time dd if=3D/dev/zero of=3D/tmp/1GB bs=3D1M count=3D1024; time sync > > > > The dd returns as soon as all data could be buffered in RAM. Only sync=20 > > will show how long it takes to actually write out the data to disk. > > also explains why you see better results is writeout starts earlier. > > Still not that bad: > > [~]> time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024;sync > 1024+0 records in > 1024+0 records out > 1073741824 bytes (1,1 GB) copied, 53,3194 s, 20,1 MB/s > > real0m53.517s > user0m0.003s > sys 0m3.193s No, your measure is wrong because time measures "dd" and sync is done after. Either use Tim's method (time sync) or the one I proposed in previous mail (time dd | sync). Anyway, in your situation with a very small buffer, this should not change by more than half a second or so. Willy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
20 Oca 2007 Cts 22:10 tarihinde, Tim Schmielau şunları yazmıştı: [...] > > Note that these dd "benchmarks" are completely bogus, because the data=20 > doesn't actually get written to disk in that time. For some enlightening=20 > data, try > > time dd if=3D/dev/zero of=3D/tmp/1GB bs=3D1M count=3D1024; time sync > > The dd returns as soon as all data could be buffered in RAM. Only sync=20 > will show how long it takes to actually write out the data to disk. > also explains why you see better results is writeout starts earlier. Still not that bad: [~]> time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024;sync 1024+0 records in 1024+0 records out 1073741824 bytes (1,1 GB) copied, 53,3194 s, 20,1 MB/s real0m53.517s user0m0.003s sys 0m3.193s - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
20 Oca 2007 Cts 22:05 tarihinde, Willy Tarreau şunları yazmıştı: > On Sun, Jan 21, 2007 at 01:14:41AM +0530, Sunil Naidu wrote: > > On 1/20/07, Willy Tarreau <[EMAIL PROTECTED]> wrote: [...] > It should not have changed the throughput at all if the > hardware was not a bit strange (well, it's a VAIO after all, I've > had one too, fortunately it died one month after the warranty, > putting an end to all my problems...). How true is this, VAIO sucks on Linux :'( Regards, ismail - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
On Sat, 20 Jan 2007, Ismail Dönmez wrote: > 20 Oca 2007 Cts 19:45 tarihinde şunları yazmıştınız: > [...] > > > vaio cartman # hdparm -tT /dev/hda > > > > > > /dev/hda: > > > Timing cached reads: 1576 MB in 2.00 seconds = 788.18 MB/sec > > > Timing buffered disk reads: 74 MB in 3.01 seconds = 24.55 MB/sec > > > > > > > > > [~]> time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 > > > 1024+0 records in > > > 1024+0 records out > > > 1073741824 bytes (1,1 GB) copied, 77,2809 s, 13,9 MB/s > > > > > > real1m17.482s > > > user0m0.003s > > > sys 0m2.350s > > > > That's not bad at all ! I suspect that if your system becomes unresponsive, > > it's because real writes start when the cache is full. And if you fill > > 512 MB of RAM with data that you then need to flush on disk at 14 MB/s, it > > can take about 40 seconds during which it might be difficult to do > > anything. > > > > Try lowering the cache flush starting point to about 10 MB if you want > > (2% of 512 MB) : > > > > # echo 2 >/proc/sys/vm/dirty_ratio > > # echo 2 >/proc/sys/vm/dirty_background_ratio > > After that I get, > > [~]> time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 > 1024+0 records in > 1024+0 records out > 1073741824 bytes (1,1 GB) copied, 41,7005 s, 25,7 MB/s > > real0m41.926s > user0m0.007s > sys 0m2.500s > > > not bad! thanks :) Note that these dd "benchmarks" are completely bogus, because the data doesn't actually get written to disk in that time. For some enlightening data, try time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024; time sync The dd returns as soon as all data could be buffered in RAM. Only sync will show how long it takes to actually write out the data to disk. also explains why you see better results is writeout starts earlier. Tim
Re: Weird harddisk behaviour
when there is no longer anything to take away. -- Antoine de Saint-Exupery Date: Sat, 20 Jan 2007 20:58:17 +0100 In-Reply-To: <[EMAIL PROTECTED]> (Ken Moffat's message of "Thu, 18 Jan 2007 00:18:38 +") Message-ID: <[EMAIL PROTECTED]> User-Agent: Gnus/5.1007 (Gnus v5.10.7) Emacs/20.7 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Quoting Ken Moffat <[EMAIL PROTECTED]>: > So, you were using a valid tool, but what you put in your original > mail shows garbage - something like apple_partition_ma[mamama... That's what fdisk showed me. I don't have a true UTF-8 system, so when cutting-and-pasing between the shell and mail app, it might have been distorted. But the output WAS weird! > But if it isn't, somehow the data on the disk (or the data being > read from it) is corrupt. So how come it got corrupt? I did a 'dd if=/dev/zero of=/dev/sdb' (and waited for the whole disk to be zeroed - took HOURS! :). Then partitioned the disk and cut-and-pasted the output into the mail... EVERY time I check the partition table (using mac-fdisk and not cfdisk), that destorted output is shown. It's not distorted/corrupt if I use cfdisk though... Since I don't exactly know how to do all this with the tools in Linux, I took the disk to my girlfriends WinXP and is currently using 'OnTrack EasyRecorvery Professional - ERP' to do scanns and tests of the disk. I tried parition and format it there to, but the format failed (no reason why). I'm currently running the extended S.M.A.R.T. test. And there where other tests I could do to... I'll let you know. One weird thing though... ERP only found it as a 137Mb disk! It's supposed to be a 400Gb... -- Ft. Bragg ammonium genetic Ortega Nazi Uzi FSF Cocaine North Korea Cuba Delta Force Qaddafi Treasury kibo Marxist [See http://www.aclu.org/echelonwatch/index.html for more about this] [Or http://www.europarl.eu.int/tempcom/echelon/pdf/rapport_echelon_en.pdf] If neither of these works, try http://www.aclu.org and search for echelon. Note. This is a real, not fiction. http://www.theregister.co.uk/2001/09/06/eu_releases_echelon_spying_report/ http://www.aclu.org/safefree/nsaspying/23989res20060131.html#echelon - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
On Sat, Jan 20, 2007 at 02:56:20PM -0500, Stephen Clark wrote: > Sunil Naidu wrote: > > >On 1/20/07, Willy Tarreau <[EMAIL PROTECTED]> wrote: > > > > > >>It is not expected to increase write performance, but it should help > >>you do something else during that time, or also give more responsiveness > >>to Ctrl-C. It is possible that you have fast and slow RAM, or that your > >>video card uses shared memory which slows down some parts of memory > >>which are not used anymore with those parameters. > >> > >> > > > >I did test some SATA drives, am getting these value for 2.6.20-rc5:- > > > >[EMAIL PROTECTED] ~]$ time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 > >1024+0 records in > >1024+0 records out > >1073741824 bytes (1.1 GB) copied, 21.0962 seconds, 50.9 MB/s > > > >What can you suggest here w.r.t my RAM & disk? > > > > > > > >>Willy > >> > >> > > > >Thanks, > > > >~Akula2 > >- > >To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > >the body of a message to [EMAIL PROTECTED] > >More majordomo info at http://vger.kernel.org/majordomo-info.html > >Please read the FAQ at http://www.tux.org/lkml/ > > > > > > > Hi, > whitebook vbi s96f core 2 duo t5600 2gb hitachi ATA HTS721060G9AT00 > using libata > time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 > 1024+0 records in > 1024+0 records out > 1073741824 bytes (1.1 GB) copied, 10.0092 seconds, 107 MB/s > > real0m10.196s > user0m0.004s > sys 0m3.440s You have too much RAM, it's possible that writes did not complete before the end of your measurement. Try this instead : $ time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 | sync Willy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
On Sun, Jan 21, 2007 at 01:14:41AM +0530, Sunil Naidu wrote: > On 1/20/07, Willy Tarreau <[EMAIL PROTECTED]> wrote: > >> > > > > > >It is not expected to increase write performance, but it should help > >you do something else during that time, or also give more responsiveness > >to Ctrl-C. It is possible that you have fast and slow RAM, or that your > >video card uses shared memory which slows down some parts of memory > >which are not used anymore with those parameters. > > I did test some SATA drives, am getting these value for 2.6.20-rc5:- > > [EMAIL PROTECTED] ~]$ time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 > 1024+0 records in > 1024+0 records out > 1073741824 bytes (1.1 GB) copied, 21.0962 seconds, 50.9 MB/s > > What can you suggest here w.r.t my RAM & disk? I don't suggest anything, this is already very good. The only goal of reducing memory write cache is to get a more responsive system when dumping massive amounts of data to disks, like above, because when the system starts flushing the caches, you can only wait for it to finish. But those tests are not realistic loads. A desktop and most servers will benefit from large caches. But *some* workloads will benefit from smaller caches if they consist in writing continuous streams (eg: tcpdump or video recorders). What I suggested to the user above was a way to get the system more responsive during his test. It should not have changed the throughput at all if the hardware was not a bit strange (well, it's a VAIO after all, I've had one too, fortunately it died one month after the warranty, putting an end to all my problems...). Regards, Willy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SATA exceptions with 2.6.20-rc5
Ian Kumlien wrote: Hi, I went from 2.6.19+sata_nv-adma-ncq-v7.patch, with no problems and adama enabled, to 2.6.20-rc5, which gave me problems almost instantly. I just thought that it might be interesting to know that it DID work nicely. CC since i'm not on the ml (I'm ccing more of the people who reported this) Well that's interesting.. The only significant change that went into 2.6.20-rc5 in that driver that wasn't in that version you mentioned was this one: http://www2.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2dec7555e6bf2772749113ea0ad454fcdb8cf861 Could you (or anyone else) test what happens if you take the 2.6.20-rc5 version of sata_nv.c and try it on 2.6.19? That would tell us whether it's this change or whether it's something else (i.e. in libata core). Assuming that still doesn't work, can you then try removing these lines from nv_host_intr in 2.6.20-rc5 sata_nv.c and see what that does? /* bail out if not our interrupt */ if (!(irq_stat & NV_INT_DEV)) return 0; as that's the difference I'm most suspicious of causing the problem. -- Robert Hancock Saskatoon, SK, Canada To email, remove "nospam" from [EMAIL PROTECTED] Home Page: http://www.roberthancock.com/ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
Sunil Naidu wrote: On 1/20/07, Willy Tarreau <[EMAIL PROTECTED]> wrote: It is not expected to increase write performance, but it should help you do something else during that time, or also give more responsiveness to Ctrl-C. It is possible that you have fast and slow RAM, or that your video card uses shared memory which slows down some parts of memory which are not used anymore with those parameters. I did test some SATA drives, am getting these value for 2.6.20-rc5:- [EMAIL PROTECTED] ~]$ time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 1024+0 records in 1024+0 records out 1073741824 bytes (1.1 GB) copied, 21.0962 seconds, 50.9 MB/s What can you suggest here w.r.t my RAM & disk? Willy Thanks, ~Akula2 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ Hi, whitebook vbi s96f core 2 duo t5600 2gb hitachi ATA HTS721060G9AT00 using libata time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 1024+0 records in 1024+0 records out 1073741824 bytes (1.1 GB) copied, 10.0092 seconds, 107 MB/s real0m10.196s user0m0.004s sys 0m3.440s -- "They that give up essential liberty to obtain temporary safety, deserve neither liberty nor safety." (Ben Franklin) "The course of history shows that as a government grows, liberty decreases." (Thomas Jefferson) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
On 1/20/07, Willy Tarreau <[EMAIL PROTECTED]> wrote: > > > It is not expected to increase write performance, but it should help you do something else during that time, or also give more responsiveness to Ctrl-C. It is possible that you have fast and slow RAM, or that your video card uses shared memory which slows down some parts of memory which are not used anymore with those parameters. I did test some SATA drives, am getting these value for 2.6.20-rc5:- [EMAIL PROTECTED] ~]$ time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 1024+0 records in 1024+0 records out 1073741824 bytes (1.1 GB) copied, 21.0962 seconds, 50.9 MB/s What can you suggest here w.r.t my RAM & disk? Willy Thanks, ~Akula2 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/2] Explicitly set pgid/sid of init
Pls treat this patch as Patch 2/2 where Patch 1/2 is http://lkml.org/lkml/2007/1/19/159 --- From: Sukadev Bhattiprolu <[EMAIL PROTECTED]> Explicitly set pgid and sid of init process to 1. Signed-off-by: Sukadev Bhattiprolu <[EMAIL PROTECTED]> Cc: Cedric Le Goater <[EMAIL PROTECTED]> Cc: Dave Hansen <[EMAIL PROTECTED]> Cc: Serge Hallyn <[EMAIL PROTECTED]> Cc: Eric Biederman <[EMAIL PROTECTED]> Cc: [EMAIL PROTECTED] --- init/main.c |1 + kernel/exit.c |4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) Index: lx26-20-rc4-mm1/init/main.c === --- lx26-20-rc4-mm1.orig/init/main.c2007-01-20 11:12:00.803672744 -0800 +++ lx26-20-rc4-mm1/init/main.c 2007-01-20 11:12:20.786634872 -0800 @@ -774,6 +774,7 @@ static int __init init(void * unused) */ init_pid_ns.child_reaper = current; + __set_special_pids(1, 1); cad_pid = task_pid(current); smp_prepare_cpus(max_cpus); Index: lx26-20-rc4-mm1/kernel/exit.c === --- lx26-20-rc4-mm1.orig/kernel/exit.c 2007-01-20 11:12:00.803672744 -0800 +++ lx26-20-rc4-mm1/kernel/exit.c 2007-01-20 11:12:20.787634720 -0800 @@ -297,12 +297,12 @@ void __set_special_pids(pid_t session, p { struct task_struct *curr = current->group_leader; - if (process_session(curr) != session) { + if (pid_nr(task_session(curr)) != session) { detach_pid(curr, PIDTYPE_SID); set_signal_session(curr->signal, session); attach_pid(curr, PIDTYPE_SID, find_pid(session)); } - if (process_group(curr) != pgrp) { + if (pid_nr(task_pgrp(curr)) != pgrp) { detach_pid(curr, PIDTYPE_PGID); curr->signal->pgrp = pgrp; attach_pid(curr, PIDTYPE_PGID, find_pid(pgrp)); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: MSI failure on nForce 430 (WAS: intel 82571EB gigabit fails to see link on 2.6.20-rc5 in-tree e1000 driver (regression))
Adam Kropelin wrote: I've attached the contents dmesg, 'lspci -vvv', and 'cat /proc/interrupts' from 2.6.20-rc5. Actually attached this time. --Adam proc-irq-2.6.20-rc5 Description: Binary data dmesg-2.6.20-rc5 Description: Binary data lspci-2.6.20-rc5 Description: Binary data
MSI failure on nForce 430 (WAS: intel 82571EB gigabit fails to see link on 2.6.20-rc5 in-tree e1000 driver (regression))
(cc: list trimmed and thread moved to linux-pci) I have a PCI-E e1000 card that does not see interrupts on 2.6.20-rc5 unless CONFIG_PCI_MSI is disabled. An e1000 maintainer indicated that the PHY state is correct, it's just that the interrupt is not getting thru to the kernel. Interestingly, on 2.6.19 PHY interrupts get thru ok with MSI enabled (link status responds appropriately) but packet tx fails with timeout errors, implying that perhaps MAC interrupts are not arriving. I've attached the contents dmesg, 'lspci -vvv', and 'cat /proc/interrupts' from 2.6.20-rc5. This is an nForce 430 based chipset on a Dell E521 which has had interrupt routing issues before. Prior to 2.6.19 it had to be booted with 'noapic' in order to come up at all. It also had USB lockup problems until I applied the latest BIOS update (v1.1.4). So a BIOS interrupt routing bug with MSI is not out of the question. I'm happy to gather more data or run tests... --Adam - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
BUG: incorrect MD5 hash value on x86_64 with TCP-MD5
Hello all, There's still a bug in the new TCP-MD5 feature. On x86_64, the hash function is fed wrong TCP payload content. The same kernel, same conf (except arch -> x86) on an Athlon doesn't have the problem. Kernel is a vanilla 2.6.20-rc5. I put debugging printk()s into tcp_v4_do_calc_md5_hash() and logged every byte fed into MD5. Here's one packet sniffed from the wire: 00 30 05 27 38 90 00 04 23 d7 af ed 08 00 45 00 0010 00 69 b8 3c 40 00 40 06 86 a1 d4 57 21 04 d4 57 0020 31 fe a3 be 00 b3 bd b9 3f 14 5b e1 17 5e a0 18 0030 16 d0 2e 0b 00 00 01 01 13 12 13 e2 0e db db e0 0040 46 e3 67 02 17 38 ca e7 1f 1f ff ff ff ff ff ff 0050 ff ff ff ff ff ff ff ff ff ff 00 2d 01 04 30 e0 0060 00 b4 d4 57 21 04 10 02 06 01 04 00 01 00 01 02 0070 02 80 00 02 02 02 00 The TCP payload starts at all those FFs. Here's what the hash function sees: dump of tcp pseudo-header (12 byte): d4 57 21 04 d4 57 31 fe 00 06 00 55 that's correct: 212.87.33.4 -> 212.87.49.254, protocol 6=TCP, 0x55 bytes dump of tcp header without options (20 byte): a3 be 00 b3 bd b9 3f 14 5b e1 17 5e a0 18 16 d0 00 00 00 00 also correct: copied verbatim from the packet, except the checksum is set to zero. dump of data (45 byte): 02 00 01 00 01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 30 37 ea 7f 00 81 ff ff 00 00 2d 00 03 00 00 00 01 00 00 00 00 TOTAL RUBBISH! I don't know where this comes from, or why it is found after the TCP header. It doesn't appear in the packet either. dump of password (13 byte): 73 63 68 6e 61 62 65 6c 74 61 73 73 65 that's my test password all right. dump of md5 hash result (16 byte): 13 e2 0e db db e0 46 e3 67 02 17 38 ca e7 1f 1f same as sniffed, so I didn't mix up packets. I tried passing the whole skb* into tcp_v4_do_calc_md5_hash and dumping stuff from various locations in the packet, but didn't find the real payload. Please help, I'm at my wits end here... Regards, Torsten - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
How to use an usb interface than is claimed by HID?
Hello everyone, I'm writing a driver for an USB device that has one configuration with several interfacies and one of them is a HID interface. So when I check this interface whether it's claimed (usb_interface_claimed), I find out that it is, and it's claimed by the HID driver. So here is the question: how can I ask the HID driver to unclaim this very interface for me so that I can use it? The HID driver is needed for some other devices, so I can't just rmmod it. Thanks in advice. Regards, Ivan Ukhov. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SATA exceptions with 2.6.20-rc5
On Saturday, 20. January 2007 03:41, Robert Hancock wrote: > Alistair John Strachan wrote: > > On Tuesday 16 January 2007 01:53, Jeff Garzik wrote: > >> Robert Hancock wrote: > >>> I'll try your stress test when I get a chance, but I doubt I'll run > >>> into the same problem and I haven't seen any similar reports. Perhaps > >>> it's some kind of wierd timing issue or incompatibility between the > >>> controller and that drive when running in ADMA mode? I seem to remember > >>> various reports of issues with certain Maxtor drives and some nForce > >>> SATA controllers under Windows at least.. > >> > >> Just to eliminate things, has disabling ADMA been attempted? > >> > >> It can be disabled using the sata_nv.adma module parameter. > > > > Setting this option fixes the problem for me. I suggest that ADMA > > defaults off in 2.6.20, if there's still time to do that. > > Can you guys that are having this problem try the attached debug patch? > It's possible it will fix the problem, as I'm trying a private > exec_command implementation that flushes the write by reading a > controller register instead of reading altstatus from the drive like the > libata core code does. > > If the problem still happens, I also added some more debugging in to > help figure out what is going on, so please post full dmesg. > > By the way, I assume that you guys are using reiserfs or xfs, as it > appears no other file systems issue flush commands automatically. I had > to test this by "echo 1 > delete" on the SCSI disk in sysfs, as I am > using ext3. Yes, I've some reiserfs partitions, but I don't think it's reiserfs fault ;). Here is the log. (I cut out some parts, because it was too big.) BTW: please CC, I'm not on the list! 18:17:29 sys kernel: Linux version 2.6.20-rc5 ([EMAIL PROTECTED]) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #2 SMP PREEMPT Sat 18:07:36 CET 2007 18:17:29 sys kernel: Command line: root=/dev/md1 ro 18:17:29 sys kernel: BIOS-provided physical RAM map: 18:17:29 sys kernel: BIOS-e820: - 0009f800 (usable) 18:17:29 sys kernel: BIOS-e820: 0009f800 - 000a (reserved) 18:17:29 sys kernel: BIOS-e820: 000f - 0010 (reserved) 18:17:29 sys kernel: BIOS-e820: 0010 - 7fff (usable) 18:17:29 sys kernel: BIOS-e820: 7fff - 7fff3000 (ACPI NVS) 18:17:29 sys kernel: BIOS-e820: 7fff3000 - 8000 (ACPI data) 18:17:29 sys kernel: BIOS-e820: e000 - f000 (reserved) 18:17:29 sys kernel: BIOS-e820: fec0 - 0001 (reserved) 18:17:29 sys kernel: Entering add_active_range(0, 0, 159) 0 entries of 256 used 18:17:29 sys kernel: Entering add_active_range(0, 256, 524272) 1 entries of 256 used 18:17:29 sys kernel: end_pfn_map = 1048576 18:17:29 sys kernel: DMI 2.3 present. 18:17:29 sys kernel: ACPI: RSDP (v000 Nvidia) @ 0x000f7d30 18:17:29 sys kernel: ACPI: RSDT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x) @ 0x7fff3040 18:17:29 sys kernel: ACPI: FADT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x) @ 0x7fff30c0 18:17:29 sys kernel: ACPI: SSDT (v001 PTLTD POWERNOW 0x0001 LTP 0x0001) @ 0x7fff9900 18:17:29 sys kernel: ACPI: SRAT (v001 AMDHAMMER 0x0001 AMD 0x0001) @ 0x7fff9b40 18:17:29 sys kernel: ACPI: MCFG (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x) @ 0x7fff9c40 18:17:29 sys kernel: ACPI: MADT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x) @ 0x7fff9840 18:17:29 sys kernel: ACPI: DSDT (v001 NVIDIA AWRDACPI 0x1000 MSFT 0x010e) @ 0x 18:17:29 sys kernel: Entering add_active_range(0, 0, 159) 0 entries of 256 used 18:17:29 sys kernel: Entering add_active_range(0, 256, 524272) 1 entries of 256 used 18:17:29 sys kernel: Zone PFN ranges: 18:17:29 sys kernel: DMA 0 -> 4096 18:17:29 sys kernel: DMA324096 -> 1048576 18:17:29 sys kernel: Normal1048576 -> 1048576 18:17:29 sys kernel: early_node_map[2] active PFN ranges 18:17:29 sys kernel: 0:0 -> 159 18:17:29 sys kernel: 0: 256 -> 524272 18:17:29 sys kernel: On node 0 totalpages: 524175 18:17:29 sys kernel: DMA zone: 56 pages used for memmap 18:17:29 sys kernel: DMA zone: 10 pages reserved 18:17:29 sys kernel: DMA zone: 3933 pages, LIFO batch:0 18:17:29 sys kernel: DMA32 zone: 7111 pages used for memmap 18:17:29 sys kernel: DMA32 zone: 513065 pages, LIFO batch:31 18:17:29 sys kernel: Normal zone: 0 pages used for memmap 18:17:29 sys kernel: Nvidia board detected. Ignoring ACPI timer override. 18:17:29 sys kernel: If you got timer trouble try acpi_use_timer_override 18:17:29 sys kernel: ACPI: PM-Timer IO Port: 0x4008 18:17:29 sys kernel: ACPI: Local APIC address 0xfee0 18:17:29 sys kernel: ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) 18:17:29 sys kernel: Processor #0 (Bootup-CPU) 18:17:29 sys kernel: ACPI:
[PATCH] Remove final reference to superfluous smp_commence().
Remove the last (and commented out) invocation of the obsolete smp_commence() call. Signed-off-by: Robert P. J. Day <[EMAIL PROTECTED]> --- diff --git a/init/main.c b/init/main.c index 8b4a7d7..4e88bdd 100644 --- a/init/main.c +++ b/init/main.c @@ -395,11 +395,6 @@ static void __init smp_init(void) /* Any cleanup work */ printk(KERN_INFO "Brought up %ld CPUs\n", (long)num_online_cpus()); smp_cpus_done(max_cpus); -#if 0 - /* Get other processors into their bootup holding patterns. */ - - smp_commence(); -#endif } #endif - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 8/15] ide: disable DMA in ->ide_dma_check for "no IORDY" case
Hello. Bartlomiej Zolnierkiewicz wrote: The other advantage of doing cleanups is that code becomes cleaner/simpler which matters a lot for this codebase, i.e. ide-dma-off-void.patch exposed (yet to be fixed) bug in set_using_dma() (->ide_dma_off_quietly always returns 0 which is passed by ->ide_dma_check to set_using_dma() which incorrectly then calls ->ide_dma_on). Well, this seems a newly intruduced bug. The old code is so convulted that it is hard to see it w/o cleanup. :) ->ide_dma_check implementations often do return hwif->ide_dma_off_quietly(drive); so the return value of ide_dma_off_quietly() (which is always 0) is passed to if (HWIF(drive)->ide_dma_check(drive)) return -EIO; in ide.c:set_using_dma() -> as a result the next line is executed if (HWIF(drive)->ide_dma_on(drive)) return -EIO; Ah, indeed! Nice. :-) It's all fine but goes somewhat against Linus' policy as far as I understnad it: fixes are merged all the time while cleanups (along with new code) are merged mostly duting the merge window. Moreover I don't find the current tree to be more of cleanups than fixes, here is the analysis of current series file: Maybe I slightly exaggerated, being impressed by the volume of your recent changes. :-) But still... # # IDE patches from 2.6.20-rc3-mm1 # toshiba-tc86c001-ide-driver-take-2.patch toshiba-tc86c001-ide-driver-take-2-fix.patch toshiba-tc86c001-ide-driver-take-2-fix-2.patch -- new driver I'd count that as cleanup, since it's definitely not fix. ;-) hpt3xx-rework-rate-filtering.patch hpt3xx-rework-rate-filtering-tidy.patch hpt3xx-print-the-real-chip-name-at-startup.patch hpt3xx-switch-to-using-pci_get_slot.patch hpt3xx-cache-channels-mcr-address.patch hpt3x7-merge-speedproc-handlers.patch hpt370-clean-up-dma-timeout-handling.patch hpt3xx-init-code-rewrite.patch piix-fix-82371mx-enablebits.patch piix-tuneproc-fixes-cleanups.patch slc90e66-carry-over-fixes-from-piix-driver.patch hpt36x-pci-clock-detection-fix.patch jmicron-warning-fix.patch -- fixes (but most have cleanups mixed in) Yeah, but not that those came in from the -mm tree. Oops, "not that" didn't make sense here :-) ide-mmio-flag.patch -- cleanup hpt34x-tune-chipset-fix.patch -- fix ide-fix-pio-fallback.patch -- fix Those 2 are seem more of a cleanup to me... They fix real but quite hard to hit bugs. I'll put them at the end of the fixes series. Well, most of the recent fixes were for such issues. Nobody had screamed about them, it took a code review to find them. :-) ide-set-dma-helper.patch ide-dma-off-void.patch ide-dma-host-on-void.patch ide-fix-dma-masks.patch ide-max-dma-mode.patch ide-tune-dma-helper.patch -- cleanups Would make sense to keep those last in the tail of queue because of the amount of changes they introduce. Possibly even IDE subsystem wide cleanups They are at the end already - no problem here. :) I meant "in the future"... and if you would like me to shuffle ordering of the patches (but without need of rewritting them) it also OK Erm, no talking about the rewrite but that way you may have to rebase cleanups on top of fixes. This seems unavoidble though due to the way the kernel patch acceptance process is working, as far as I understand it... I'll change the ordering of the patches based on your suggestions and generally try to keep such order of fixes first and cleanups later. Thanks. :-) Index: b/drivers/ide/pci/cmd64x.c === --- a/drivers/ide/pci/cmd64x.c +++ b/drivers/ide/pci/cmd64x.c @@ -479,12 +479,10 @@ static int cmd64x_config_drive_for_dma ( if (ide_use_dma(drive) && config_chipset_for_dma(drive)) return hwif->ide_dma_on(drive); - if (ide_use_fast_pio(drive)) { + if (ide_use_fast_pio(drive)) config_chipset_for_pio(drive, 1); This function will always set PIO mode 4. Mess. Yep. I'm going to send the patch for both this and siimage.c... OK Not sure if I'll be able to find a card to test it soon though (I prefer to test my stuff before submitting, even the simple changes :-). Please send it anyway. Ugh, this one is more tough than pdc202xx_old.c -- since tuneproc() is also borken (doesn't set drive's own transfer mode). And... I looked into speedproc() handler, then into PCI0646U datasheet for reference and was really terrified: the code for SW/DW DMA setup us utter nonsense! It writes to some reserved bits of BMIDE status reg. instead of doinf the real setup, and twiddles the drive 0/1 DMA capable bit which nobody asks it to do... Really messy mess. :-( Thanks, Bart WBR, Sergei - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at
Re: [PATCH 8/15] ide: disable DMA in ->ide_dma_check for "no IORDY" case
Hi, Sergei Shtylyov wrote: > Bartlomiej Zolnierkiewicz wrote: > > I've looked thru the code, and found more issues with the PIO fallback > there. Will try to cook up patches for at least some drivers... > Great, if possible please base them on top of the IDE tree... > >>> Erm, I had doubts about it (having in mind that all that code is more of a >>> cleanups than fixes). Maybe it'd be a good idea to separate the fix and >>> cleanup series somehow... > >> I generally tend do cleanups as a groundwork for the real fixes and separate >> cleanups and fixes to have good base for dealing with regressions. Often all >> changes (cleanups/fixes) could be included in one patch but then I would have >> had harsh times when debugging the regressions. It matters a lot if you hit >> an unknown (or known but the documentation is covered by NDA) hardware bug >> - you can concentrate on a small patch changing the way in which hardware is >> accessed instead of that big patch moving code around etc. > >> Also the thing is that the same bugs are propagated over many drivers so >> doing >> cleanups which merge code before fixing the bug makes sense. We can then fix >> the damn bug once and for all and not worry about somebody copy-n-pasting >> the bug from the yet-to-be-fixed driver (i.e. in the next patch IDE update >> there will be patch to check return value of ->speedproc in ide_tune_dma(), >> without ide-fix-dma-mask/ide-max-dma-mode/ide-tune-dma-helper patches >> I would have to go over all drivers to fix this bug and still there won't >> be a guarantee that same bug wouldn't be introduced in some new driver). > >> The other advantage of doing cleanups is that code becomes cleaner/simpler >> which matters a lot for this codebase, i.e. ide-dma-off-void.patch exposed >> (yet to be fixed) bug in set_using_dma() (->ide_dma_off_quietly always >> returns >> 0 which is passed by ->ide_dma_check to set_using_dma() which incorrectly >> then calls ->ide_dma_on). > >Well, this seems a newly intruduced bug. The old code is so convulted that it is hard to see it w/o cleanup. :) ->ide_dma_check implementations often do return hwif->ide_dma_off_quietly(drive); so the return value of ide_dma_off_quietly() (which is always 0) is passed to if (HWIF(drive)->ide_dma_check(drive)) return -EIO; in ide.c:set_using_dma() -> as a result the next line is executed if (HWIF(drive)->ide_dma_on(drive)) return -EIO; >It's all fine but goes somewhat against Linus' policy as far as I > understnad it: fixes are merged all the time while cleanups (along with new > code) are merged mostly duting the merge window. > >> Moreover I don't find the current tree to be more of cleanups than fixes, >> here is the analysis of current series file: > >Maybe I slightly exaggerated, being impressed by the volume of your recent > changes. :-) >But still... > >> # >> # IDE patches from 2.6.20-rc3-mm1 >> # >> toshiba-tc86c001-ide-driver-take-2.patch >> toshiba-tc86c001-ide-driver-take-2-fix.patch >> toshiba-tc86c001-ide-driver-take-2-fix-2.patch >> -- new driver > > I'd count that as cleanup, since it's definitely not fix. ;-) > >> hpt3xx-rework-rate-filtering.patch >> hpt3xx-rework-rate-filtering-tidy.patch >> hpt3xx-print-the-real-chip-name-at-startup.patch >> hpt3xx-switch-to-using-pci_get_slot.patch >> hpt3xx-cache-channels-mcr-address.patch >> hpt3x7-merge-speedproc-handlers.patch >> hpt370-clean-up-dma-timeout-handling.patch >> hpt3xx-init-code-rewrite.patch >> piix-fix-82371mx-enablebits.patch >> piix-tuneproc-fixes-cleanups.patch >> slc90e66-carry-over-fixes-from-piix-driver.patch >> hpt36x-pci-clock-detection-fix.patch >> jmicron-warning-fix.patch >> -- fixes (but most have cleanups mixed in) > >Yeah, but not that those came in from the -mm tree. > >> pdc202xx_new-remove-useless-code.patch >> pdc202xx_-remove-check_in_drive_lists-abomination.patch >> -- cleanups >> # >> # IDE patches applied by Andrew (2.6.20-rc4-mm1) >> # >> atiixpc-remove-unused-code.patch >> -- cleanup >> atiixpc-sb600-ide-only-has-one-channel.patch >> atiixpc-add-cable-detection-support-for-ati-ide.patch >> ide-generic-jmicron-has-its-own-drivers-now.patch >> -- fixes > >Same about these 3. > >> ide-maintainers-entry.patch >> -- n/a >> # >> # IT8213 >> # >> it8213-ide-driver.patch >> it8213-ide-driver-update.patch >> -- new driver >> # >> # patches posted on Jan 11 2007 >> # >> ia64-pci_get_legacy_ide_irq.patch >> ide-pci-init-tags.patch >> -- fixes >> pdc202xx_old-dead-code.patch >> au1xxx-dead-code.patch >> ide-pio-blacklisted.patch >> ide-no-dsc-flag.patch >> trm290-dma-ifdefs.patch >> ide-pci-device-tables.patch >> ide-dev-openers.patch >> hpt366-init-dma.patch >> cs5530-cleanup.patch >> svwks-cleanup.patch >> sis5513-config-xfer-rate.patch >> ide-set-xfer-rate.patch >> ide-use-fast-pio-v2.patch >>
Re: PROBLEM: KB->KiB, MB -> MiB, ... (IEC 60027-2)
Hello, On 1/20/07, David Schwartz <[EMAIL PROTECTED]> wrote: > [1.] One line summary of the problem: > KB->KiB, MB -> MiB, ... (IEC 60027-2 Letter symbols to be used in > electrical > technology – Part 2) > Should be everywere KiB, MiB, GiB, ... according to IEC 60027-2 Bytes are not an SI unit. A "megabyte" doesn't have to be a million bytes any more than a "megaphone" has to be a million phones. A "megabyte" is 1,048,576 bytes. The "mega" in there is not an SI prefix. "Mega" is only an SI prefix when it appears before an SI unit. Nice observation, however, it still leaves quite an amount of internal inconsistencies in the kernel output. One way of getting rid of those inconsistencies would be to follow IEC 60027-2 for those cases where SI is inappropriate. Leon. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
20 Oca 2007 Cts 20:03 tarihinde şunları yazmıştınız: > On Sat, Jan 20, 2007 at 07:52:53PM +0200, Ismail Dönmez wrote: > > 20 Oca 2007 Cts 19:45 tarihinde ??unlar?? yazmt??n??z: > > [...] > > > > > > vaio cartman # hdparm -tT /dev/hda > > > > > > > > /dev/hda: > > > > Timing cached reads: 1576 MB in 2.00 seconds = 788.18 MB/sec > > > > Timing buffered disk reads: 74 MB in 3.01 seconds = 24.55 MB/sec > > > > > > > > > > > > [~]> time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 > > > > 1024+0 records in > > > > 1024+0 records out > > > > 1073741824 bytes (1,1 GB) copied, 77,2809 s, 13,9 MB/s > > > > > > > > real1m17.482s > > > > user0m0.003s > > > > sys 0m2.350s > > > > > > That's not bad at all ! I suspect that if your system becomes > > > unresponsive, it's because real writes start when the cache is full. > > > And if you fill 512 MB of RAM with data that you then need to flush on > > > disk at 14 MB/s, it can take about 40 seconds during which it might be > > > difficult to do anything. > > > > > > Try lowering the cache flush starting point to about 10 MB if you want > > > (2% of 512 MB) : > > > > > > # echo 2 >/proc/sys/vm/dirty_ratio > > > # echo 2 >/proc/sys/vm/dirty_background_ratio > > > > After that I get, > > > > [~]> time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 > > 1024+0 records in > > 1024+0 records out > > 1073741824 bytes (1,1 GB) copied, 41,7005 s, 25,7 MB/s > > > > real0m41.926s > > user0m0.007s > > sys 0m2.500s > > > > > > not bad! thanks :) > > It is not expected to increase write performance, but it should help > you do something else during that time, or also give more responsiveness > to Ctrl-C. It is possible that you have fast and slow RAM, or that your > video card uses shared memory which slows down some parts of memory > which are not used anymore with those parameters. Thanks I will try to upgrade RAM but for now at least responsiveness seems to be better. Regards, ismail - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
On Sat, Jan 20, 2007 at 07:52:53PM +0200, Ismail Dönmez wrote: > 20 Oca 2007 Cts 19:45 tarihinde ??unlar?? yazmt??n??z: > [...] > > > vaio cartman # hdparm -tT /dev/hda > > > > > > /dev/hda: > > > Timing cached reads: 1576 MB in 2.00 seconds = 788.18 MB/sec > > > Timing buffered disk reads: 74 MB in 3.01 seconds = 24.55 MB/sec > > > > > > > > > [~]> time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 > > > 1024+0 records in > > > 1024+0 records out > > > 1073741824 bytes (1,1 GB) copied, 77,2809 s, 13,9 MB/s > > > > > > real1m17.482s > > > user0m0.003s > > > sys 0m2.350s > > > > That's not bad at all ! I suspect that if your system becomes unresponsive, > > it's because real writes start when the cache is full. And if you fill > > 512 MB of RAM with data that you then need to flush on disk at 14 MB/s, it > > can take about 40 seconds during which it might be difficult to do > > anything. > > > > Try lowering the cache flush starting point to about 10 MB if you want > > (2% of 512 MB) : > > > > # echo 2 >/proc/sys/vm/dirty_ratio > > # echo 2 >/proc/sys/vm/dirty_background_ratio > > After that I get, > > [~]> time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 > 1024+0 records in > 1024+0 records out > 1073741824 bytes (1,1 GB) copied, 41,7005 s, 25,7 MB/s > > real0m41.926s > user0m0.007s > sys 0m2.500s > > > not bad! thanks :) It is not expected to increase write performance, but it should help you do something else during that time, or also give more responsiveness to Ctrl-C. It is possible that you have fast and slow RAM, or that your video card uses shared memory which slows down some parts of memory which are not used anymore with those parameters. Regards, Willy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
20 Oca 2007 Cts 19:45 tarihinde şunları yazmıştınız: [...] > > vaio cartman # hdparm -tT /dev/hda > > > > /dev/hda: > > Timing cached reads: 1576 MB in 2.00 seconds = 788.18 MB/sec > > Timing buffered disk reads: 74 MB in 3.01 seconds = 24.55 MB/sec > > > > > > [~]> time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 > > 1024+0 records in > > 1024+0 records out > > 1073741824 bytes (1,1 GB) copied, 77,2809 s, 13,9 MB/s > > > > real1m17.482s > > user0m0.003s > > sys 0m2.350s > > That's not bad at all ! I suspect that if your system becomes unresponsive, > it's because real writes start when the cache is full. And if you fill > 512 MB of RAM with data that you then need to flush on disk at 14 MB/s, it > can take about 40 seconds during which it might be difficult to do > anything. > > Try lowering the cache flush starting point to about 10 MB if you want > (2% of 512 MB) : > > # echo 2 >/proc/sys/vm/dirty_ratio > # echo 2 >/proc/sys/vm/dirty_background_ratio After that I get, [~]> time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 1024+0 records in 1024+0 records out 1073741824 bytes (1,1 GB) copied, 41,7005 s, 25,7 MB/s real0m41.926s user0m0.007s sys 0m2.500s not bad! thanks :) Regards, ismail - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Abysmal disk performance, how to debug?
Hi, On Sat, Jan 20, 2007 at 07:20:53PM +0200, Ismail Dönmez wrote: > Hi all, > > I own a Sony Vaio VGN-FS285B and disk performance to say at least is very > very > slow. Writing 1 GB data makes the laptop unresponsive. Here is some data > identifying the drive. Hope someone can tell me how to debug and find out > whats the problem. > > FWIW since 2.6.16 the problem is same and I didn't test with 2.4 kernels. > Here > is some data. And I tested with xfs & ext3. Both slow slow disk writes. > > vaio cartman # hdparm /dev/hda > > /dev/hda: > multcount= 16 (on) > IO_support = 1 (32-bit) > unmaskirq= 1 (on) > using_dma= 1 (on) > keepsettings = 0 (off) > readonly = 0 (off) > readahead= 256 (on) > geometry = 65535/16/63, sectors = 156301488, start = 0 > vaio cartman # hdparm -i /dev/hda > > /dev/hda: > > Model=FUJITSU MHV2080AT, FwRev=0096, SerialNo=NS56T58270LE > Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs } > RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=0 > BuffType=DualPortCache, BuffSize=8192kB, MaxMultSect=16, MultSect=16 > CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=156301488 > IORDY=yes, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120} > PIO modes: pio0 pio1 pio2 pio3 pio4 > DMA modes: mdma0 mdma1 mdma2 > UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5 udma3 udma4 *udma5 > AdvancedPM=yes: mode=0x80 (128) WriteCache=enabled > Drive conforms to: ATA/ATAPI-6 T13 1410D revision 3a: ATA/ATAPI-2 > ATA/ATAPI-3 ATA/ATAPI-4 ATA/ATAPI-5 ATA/ATAPI-6 > > * signifies the current active mode > > > vaio cartman # hdparm -tT /dev/hda > > /dev/hda: > Timing cached reads: 1576 MB in 2.00 seconds = 788.18 MB/sec > Timing buffered disk reads: 74 MB in 3.01 seconds = 24.55 MB/sec > > > [~]> time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 > 1024+0 records in > 1024+0 records out > 1073741824 bytes (1,1 GB) copied, 77,2809 s, 13,9 MB/s > > real1m17.482s > user0m0.003s > sys 0m2.350s That's not bad at all ! I suspect that if your system becomes unresponsive, it's because real writes start when the cache is full. And if you fill 512 MB of RAM with data that you then need to flush on disk at 14 MB/s, it can take about 40 seconds during which it might be difficult to do anything. Try lowering the cache flush starting point to about 10 MB if you want (2% of 512 MB) : # echo 2 >/proc/sys/vm/dirty_ratio # echo 2 >/proc/sys/vm/dirty_background_ratio I see nothing wrong in your measures nor messages. Regards, Willy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Serial port blues
Hi, On Fri, Jan 19, 2007 at 03:37:34PM -0600, Joe Barr wrote: > > I'm forwarding this post by the author of a great little program for > digital amateur radio on Linux, because I'm curious whether or not the > problem he is seeing can be resolved outside the kernel. At least, I see one wrong claim and one unexplored track in his report. The wrong claim : the serial port can only be controled by the kernel. It is totally wrong for true serial ports. If he does not want to use ioctl(), then he can directly program the I/O port. The unexplored track : he talked about nice -20. He did not seem to try playing with sched_setscheduler(). I've been using this with a few programs to get (close to) real-time responsiveness and it gives very good results. Not sure whether it will work for his case, though, but it's easy to try, basically, he just has to add this to the top of his program : #include ... main() { struct sched_param sch; /* see man sched_setscheduler for other options */ sch.sched_priority = 1; if (sched_setscheduler(getpid(), SCHED_FIFO, ) == -1) perror("failed. Got root ?"); /* rest of the program now running with real-time prio */ } Now he must be careful about avoiding busy loops in the rest of the program, or he will have to use the reset button. > All comments welcome on/off list. > > Thanks, > Joe Barr > K1GPL [ rest stripped ] Regards, Willy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ckrm-tech] [PATCH 3/6] containers: Add generic multi-subsystem API to containers
Paul Menage wrote: On 1/11/07, Balbir Singh <[EMAIL PROTECTED]> wrote: to 0. To walk the hierarchy, I have no root now since I do not have any task context. I was wondering if exporting the rootnode or providing a function to export the rootnode of the mounter hierarchy will make programming easier. Ah - I misunderstood what you were looking for before. Here it is, a simple patch to keep track of percentage cpu load of a container. This patch depends on the add mount callbacks patch and another patch that fixes cpuacct for powerpc boxes (posted previously). The patch attempts to add a percentage load calculation for each container. It also maintains an accumulated time counter, which accounts for the total cpu time taken by the container. Compiled and tested on a 4 cpu powerpc box. Paul, please include this in your next series of patches for containers. Signed-off-by: <[EMAIL PROTECTED]> --- include/linux/cpu_acct.h |4 ++ kernel/cpu_acct.c| 90 +++ kernel/sched.c |7 ++- 3 files changed, 93 insertions(+), 8 deletions(-) diff -puN kernel/cpu_acct.c~cpu_acct_load_acct kernel/cpu_acct.c --- linux-2.6.20-rc5/kernel/cpu_acct.c~cpu_acct_load_acct 2007-01-20 18:28:26.0 +0530 +++ linux-2.6.20-rc5-balbir/kernel/cpu_acct.c 2007-01-20 22:32:49.0 +0530 @@ -13,16 +13,22 @@ #include #include #include +#include #include struct cpuacct { struct container_subsys_state css; spinlock_t lock; cputime64_t time; // total time used by this class + cputime64_t accum_time; // total time used by this class }; static struct container_subsys cpuacct_subsys; static struct container *root; +static spinlock_t interval_lock; +static cputime64_t interval_time; +static unsigned long long timestamp; +static unsigned long long interval; static inline struct cpuacct *container_ca(struct container *cont) { @@ -41,6 +47,8 @@ static int cpuacct_create(struct contain if (!ca) return -ENOMEM; spin_lock_init(>lock); cont->subsys[cpuacct_subsys.subsys_id] = >css; + ca->time = cputime64_zero; + ca->accum_time = cputime64_zero; return 0; } @@ -67,17 +75,35 @@ static ssize_t cpuusage_read(struct cont size_t nbytes, loff_t *ppos) { struct cpuacct *ca = container_ca(cont); - cputime64_t time; - unsigned long time_in_jiffies; + unsigned long long time; + unsigned long long accum_time; + unsigned long long interval_jiffies; char usagebuf[64]; char *s = usagebuf; spin_lock_irq(>lock); - time = ca->time; + time = cputime64_to_jiffies64(ca->time); + accum_time = cputime64_to_jiffies64(ca->accum_time); spin_unlock_irq(>lock); - time_in_jiffies = cputime_to_jiffies(time); - s += sprintf(s, "%llu\n", (unsigned long long) time_in_jiffies); + spin_lock_irq(_lock); + interval_jiffies = cputime64_to_jiffies64(interval_time); + spin_unlock_irq(_lock); + + s += sprintf(s, "time %llu\n", time); + s += sprintf(s, "accumulated time %llu\n", accum_time); + s += sprintf(s, "time since interval %llu\n", interval_jiffies); + + /* +* Calculate time in percentage +*/ + time *= 100; + if (interval_jiffies) + do_div(time, interval_jiffies); + else + time = 0; + + s += sprintf(s, "load %llu\n", time); return simple_read_from_buffer(buf, nbytes, ppos, usagebuf, s - usagebuf); } @@ -96,7 +122,6 @@ static int cpuacct_populate(struct conta void cpuacct_charge(struct task_struct *task, cputime_t cputime) { - struct cpuacct *ca; unsigned long flags; @@ -106,11 +131,60 @@ void cpuacct_charge(struct task_struct * if (ca) { spin_lock_irqsave(>lock, flags); ca->time = cputime64_add(ca->time, cputime); + ca->accum_time = cputime64_add(ca->accum_time, cputime); spin_unlock_irqrestore(>lock, flags); } rcu_read_unlock(); } +void cpuacct_uncharge(struct task_struct *task, cputime_t cputime) +{ + struct cpuacct *ca; + unsigned long flags; + + if (cpuacct_subsys.subsys_id < 0 || !root) return; + rcu_read_lock(); + ca = task_ca(task); + if (ca) { + spin_lock_irqsave(>lock, flags); + ca->time = cputime64_sub(ca->time, cputime); + ca->accum_time = cputime64_sub(ca->accum_time, cputime); + spin_unlock_irqrestore(>lock, flags); + } + rcu_read_unlock(); +} + +static void reset_ca_time(struct container *root) +{ + struct container *child; + struct cpuacct *ca; + + if (root) { + ca = container_ca(root); + if (ca) { + spin_lock(>lock); + ca->time =
Abysmal disk performance, how to debug?
Hi all, I own a Sony Vaio VGN-FS285B and disk performance to say at least is very very slow. Writing 1 GB data makes the laptop unresponsive. Here is some data identifying the drive. Hope someone can tell me how to debug and find out whats the problem. FWIW since 2.6.16 the problem is same and I didn't test with 2.4 kernels. Here is some data. And I tested with xfs & ext3. Both slow slow disk writes. vaio cartman # hdparm /dev/hda /dev/hda: multcount= 16 (on) IO_support = 1 (32-bit) unmaskirq= 1 (on) using_dma= 1 (on) keepsettings = 0 (off) readonly = 0 (off) readahead= 256 (on) geometry = 65535/16/63, sectors = 156301488, start = 0 vaio cartman # hdparm -i /dev/hda /dev/hda: Model=FUJITSU MHV2080AT, FwRev=0096, SerialNo=NS56T58270LE Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs } RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=0 BuffType=DualPortCache, BuffSize=8192kB, MaxMultSect=16, MultSect=16 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=156301488 IORDY=yes, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120} PIO modes: pio0 pio1 pio2 pio3 pio4 DMA modes: mdma0 mdma1 mdma2 UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5 udma3 udma4 *udma5 AdvancedPM=yes: mode=0x80 (128) WriteCache=enabled Drive conforms to: ATA/ATAPI-6 T13 1410D revision 3a: ATA/ATAPI-2 ATA/ATAPI-3 ATA/ATAPI-4 ATA/ATAPI-5 ATA/ATAPI-6 * signifies the current active mode vaio cartman # hdparm -tT /dev/hda /dev/hda: Timing cached reads: 1576 MB in 2.00 seconds = 788.18 MB/sec Timing buffered disk reads: 74 MB in 3.01 seconds = 24.55 MB/sec [~]> time dd if=/dev/zero of=/tmp/1GB bs=1M count=1024 1024+0 records in 1024+0 records out 1073741824 bytes (1,1 GB) copied, 77,2809 s, 13,9 MB/s real1m17.482s user0m0.003s sys 0m2.350s dmesg follows: PTL 0x005f) @ 0x1f6e9e78 ACPI: FADT (v002 Sony J1 0x20050311 PTL 0x005f) @ 0x1f6e9ee0 ACPI: BOOT (v001 Sony J1 0x20050311 PTL 0x0001) @ 0x1f6e9fd8 ACPI: MCFG (v001 Sony J1 0x20050311 PTL 0x005f) @ 0x1f6e9f9c ACPI: SSDT (v001 Sony J1 0x20050311 PTL 0x20030224) @ 0x1f6e618f ACPI: SSDT (v001 Sony J1 0x20050311 PTL 0x20030224) @ 0x1f6e5d4a ACPI: SSDT (v001 Sony J1 0x20050311 PTL 0x20030224) @ 0x1f6e5b2f ACPI: SSDT (v001 Sony J1 0x20050311 PTL 0x20030224) @ 0x1f6e5916 ACPI: DSDT (v001 Sony J1 0x20050311 PTL 0x20030224) @ 0x ACPI: PM-Timer IO Port: 0x1008 Allocating PCI resources starting at 3000 (gap: 2000:c000) Detected 1792.955 MHz processor. Built 1 zonelists. Total pages: 127731 Kernel command line: root=/dev/hda1 vga=791 mudur=language:tr Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Initializing CPU#0 PID hash table entries: 2048 (order: 11, 8192 bytes) Console: colour dummy device 80x25 Dentry cache hash table entries: 65536 (order: 6, 262144 bytes) Inode-cache hash table entries: 32768 (order: 5, 131072 bytes) Memory: 506696k/514944k available (2242k kernel code, 7824k reserved, 734k data, 176k init, 0k highmem) virtual kernel memory layout: fixmap : 0xfffeb000 - 0xf000 ( 80 kB) pkmap : 0xff80 - 0xffc0 (4096 kB) vmalloc : 0xe000 - 0xff7fe000 ( 503 MB) lowmem : 0xc000 - 0xdf6e ( 502 MB) .init : 0xc03ec000 - 0xc0418000 ( 176 kB) .data : 0xc033099f - 0xc03e850c ( 734 kB) .text : 0xc010 - 0xc033099f (2242 kB) Checking if this processor honours the WP bit even in supervisor mode... Ok. Calibrating delay using timer specific routine.. 3462.00 BogoMIPS (lpj=5768004) Security Framework v1.0.0 initialized Mount-cache hash table entries: 512 CPU: After generic identify, caps: afe9fbff 0010 0180 CPU: L1 I cache: 32K, L1 D cache: 32K CPU: L2 cache: 2048K CPU: After all inits, caps: afe9fbff 0010 2040 0180 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#0. Compat vDSO mapped to e000. CPU: Intel(R) Pentium(R) M processor 1.73GHz stepping 08 Checking 'hlt' instruction... OK. ACPI: Core revision 20060707 ACPI: setting ELCR to 0200 (from 0cb8) NET: Registered protocol family 16 No dock devices found. ACPI: bus type pci registered PCI: Using MMCONFIG Setting up standard PCI resources ACPI: Interpreter enabled ACPI: Using PIC for interrupt routing ACPI: PCI Root Bridge [PCI0] (:00) PCI: Probing PCI hardware (bus 00) Boot video device is :00:02.0 PCI quirk: region 1000-107f claimed by ICH6 ACPI/GPIO/TCO PCI quirk: region 1180-11bf claimed by ICH6 GPIO PCI: Firmware left :06:08.0 e100 interrupts enabled, disabling PCI: Transparent bridge - :00:1e.0 PCI: Bus #07 (-#0a) is hidden behind transparent bridge #06 (-#07) (try 'pci=assign-busses') Please report the result to linux-kernel to fix this permanently ACPI: PCI
Re: unable to mmap /dev/kmem
On 1/19/07, Hugh Dickins <[EMAIL PROTECTED]> wrote: Apology surely accepted, it's a confusing area (inevitably: in a driver for mem, the distinction between addresses and offsets become blurred). But please note, the worst of it was, that your patch comment gave no hint that you were knowingly changing its behaviour on the "main" architectures: it reads as if you were simply fixing it up on a few less popular architectures where an anomaly had been missed. Because I was thinking that the expected behaviour was the one implemented before 2.6.12. So I really thought to fix a bug, again sorry for not having checked the history... That said, it's really confusing to pass a virtual address as an offset because: (a) mmap() has always worked with offset not addresses; (b) the kernel will treat this virtual address as an offset until kmem driver convert it back to a virtual address. And it seems that during this convertion the lowest bits of the virtual address will be lost... Maybe read/write behaviours should be changed to use the offset as an offset and not as a virtual address. > > I guess it's reassuring to know that not many are actually > > using mmap of /dev/kmem, so you're the first to notice: thanks. > > yes it doesn't seems to be used. In my case, I was just playing with > it when I submitted the patch but have no real usages. Have I got it right, that actually the problem you thought you were fixing does not even exist? yes, see above. __pa was already doing the right thing on all architectures, wasn't it? So we can simply ask Linus to revert your patch? yes we can if the desired behaviour is the one introduced by 2.6.12. I don't think your PFN_DOWN or virt_to_phys were improvements: though mem.c happens to live in drivers/char/, imagine it under mm/. I don't get your point here. Do you mean that virt_to_phys() is only meant for drivers ? If so, I would have said that virt_to_phys() is prefered once boot memory init is finished... Franck - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 8/15] ide: disable DMA in ->ide_dma_check for "no IORDY" case
Hello. Bartlomiej Zolnierkiewicz wrote: I've looked thru the code, and found more issues with the PIO fallback there. Will try to cook up patches for at least some drivers... Great, if possible please base them on top of the IDE tree... Erm, I had doubts about it (having in mind that all that code is more of a cleanups than fixes). Maybe it'd be a good idea to separate the fix and cleanup series somehow... I generally tend do cleanups as a groundwork for the real fixes and separate cleanups and fixes to have good base for dealing with regressions. Often all changes (cleanups/fixes) could be included in one patch but then I would have had harsh times when debugging the regressions. It matters a lot if you hit an unknown (or known but the documentation is covered by NDA) hardware bug - you can concentrate on a small patch changing the way in which hardware is accessed instead of that big patch moving code around etc. Also the thing is that the same bugs are propagated over many drivers so doing cleanups which merge code before fixing the bug makes sense. We can then fix the damn bug once and for all and not worry about somebody copy-n-pasting the bug from the yet-to-be-fixed driver (i.e. in the next patch IDE update there will be patch to check return value of ->speedproc in ide_tune_dma(), without ide-fix-dma-mask/ide-max-dma-mode/ide-tune-dma-helper patches I would have to go over all drivers to fix this bug and still there won't be a guarantee that same bug wouldn't be introduced in some new driver). The other advantage of doing cleanups is that code becomes cleaner/simpler which matters a lot for this codebase, i.e. ide-dma-off-void.patch exposed (yet to be fixed) bug in set_using_dma() (->ide_dma_off_quietly always returns 0 which is passed by ->ide_dma_check to set_using_dma() which incorrectly then calls ->ide_dma_on). Well, this seems a newly intruduced bug. It's all fine but goes somewhat against Linus' policy as far as I understnad it: fixes are merged all the time while cleanups (along with new code) are merged mostly duting the merge window. Moreover I don't find the current tree to be more of cleanups than fixes, here is the analysis of current series file: Maybe I slightly exaggerated, being impressed by the volume of your recent changes. :-) But still... # # IDE patches from 2.6.20-rc3-mm1 # toshiba-tc86c001-ide-driver-take-2.patch toshiba-tc86c001-ide-driver-take-2-fix.patch toshiba-tc86c001-ide-driver-take-2-fix-2.patch -- new driver I'd count that as cleanup, since it's definitely not fix. ;-) hpt3xx-rework-rate-filtering.patch hpt3xx-rework-rate-filtering-tidy.patch hpt3xx-print-the-real-chip-name-at-startup.patch hpt3xx-switch-to-using-pci_get_slot.patch hpt3xx-cache-channels-mcr-address.patch hpt3x7-merge-speedproc-handlers.patch hpt370-clean-up-dma-timeout-handling.patch hpt3xx-init-code-rewrite.patch piix-fix-82371mx-enablebits.patch piix-tuneproc-fixes-cleanups.patch slc90e66-carry-over-fixes-from-piix-driver.patch hpt36x-pci-clock-detection-fix.patch jmicron-warning-fix.patch -- fixes (but most have cleanups mixed in) Yeah, but not that those came in from the -mm tree. pdc202xx_new-remove-useless-code.patch pdc202xx_-remove-check_in_drive_lists-abomination.patch -- cleanups # # IDE patches applied by Andrew (2.6.20-rc4-mm1) # atiixpc-remove-unused-code.patch -- cleanup atiixpc-sb600-ide-only-has-one-channel.patch atiixpc-add-cable-detection-support-for-ati-ide.patch ide-generic-jmicron-has-its-own-drivers-now.patch -- fixes Same about these 3. ide-maintainers-entry.patch -- n/a # # IT8213 # it8213-ide-driver.patch it8213-ide-driver-update.patch -- new driver # # patches posted on Jan 11 2007 # ia64-pci_get_legacy_ide_irq.patch ide-pci-init-tags.patch -- fixes pdc202xx_old-dead-code.patch au1xxx-dead-code.patch ide-pio-blacklisted.patch ide-no-dsc-flag.patch trm290-dma-ifdefs.patch ide-pci-device-tables.patch ide-dev-openers.patch hpt366-init-dma.patch cs5530-cleanup.patch svwks-cleanup.patch sis5513-config-xfer-rate.patch ide-set-xfer-rate.patch ide-use-fast-pio-v2.patch ide-io-cleanup.patch -- cleanups # # Delkin CardBus CF driver (Mark Lord <[EMAIL PROTECTED]>) # delkin_cb-ide-driver.patch -- new driver # # IDE ACPI support (Hannes Reinecke <[EMAIL PROTECTED]>) # ide-acpi-support.patch -- new functionality (fixes PM on some machines) # # ide-pnp exit fix (Tejun Heo <[EMAIL PROTECTED]>) # ide-pnp-exit-fix.patch -- fix # # VIA IDE update (Josepch Chan <[EMAIL PROTECTED]>) # via-ide-update.patch -- fix I'd put fixes before the rewrites and new code... # # patches posted on 18 Jan 2007 # it8213-ide-driver-update-fixes.patch -- fix Well, this is a fix to the newly added driver, so may go anywhere after it... ide-mmio-flag.patch -- cleanup hpt34x-tune-chipset-fix.patch -- fix
Re: O_DIRECT question
On Sunday 14 January 2007 10:11, Nate Diller wrote: > On 1/12/07, Andrew Morton <[EMAIL PROTECTED]> wrote: > Most applications don't get the kind of performance analysis that > Digeo was doing, and even then, it's rather lucky that we caught that. > So I personally think it'd be best for libc or something to simulate > the O_STREAM behavior if you ask for it. That would simplify things > for the most common case, and have the side benefit of reducing the > amount of extra code an application would need in order to take > advantage of that feature. Sounds like you are saying that making O_DIRECT really mean O_STREAM will work for everybody (including db people, except that they will moan a lot about "it isn't _real_ O_DIRECT!!! Linux suxxx"). I don't care about that. -- vda - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: O_DIRECT question
On Thursday 11 January 2007 18:13, Michael Tokarev wrote: > example, which isn't quite possible now from userspace. But as long as > O_DIRECT actually writes data before returning from write() call (as it > seems to be the case at least with a normal filesystem on a real block > device - I don't touch corner cases like nfs here), it's pretty much > THE ideal solution, at least from the application (developer) standpoint. Why do you want to wait while 100 megs of data are being written? You _have to_ have threaded db code in order to not waste gobs of CPU time on UP + even with that you eat context switch penalty anyway. I hope you agree that threaded code is not ideal performance-wise - async IO is better. O_DIRECT is strictly sync IO. -- vda - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: O_DIRECT question
On Thursday 11 January 2007 16:50, Linus Torvalds wrote: > > On Thu, 11 Jan 2007, Nick Piggin wrote: > > > > Speaking of which, why did we obsolete raw devices? And/or why not just > > go with a minimal O_DIRECT on block device support? Not a rhetorical > > question -- I wasn't involved in the discussions when they happened, so > > I would be interested. > > Lots of people want to put their databases in a file. Partitions really > weren't nearly flexible enough. So the whole raw device or O_DIRECT just > to the block device thing isn't really helping any. > > > O_DIRECT is still crazily racy versus pagecache operations. > > Yes. O_DIRECT is really fundamentally broken. There's just no way to fix > it sanely. Except by teaching people not to use it, and making the normal > paths fast enough (and that _includes_ doing things like dropping caches > more aggressively, but it probably would include more work on the device > queue merging stuff etc etc). What will happen if we just make open ignore O_DIRECT? ;) And then anyone who feels sad about is advised to do it like described here: http://lkml.org/lkml/2002/5/11/58 -- vda - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 3/3] clockevent driver for arm/pxa2xx
On Sat, 2007-01-20 at 17:08 +0100, Guennadi Liakhovetski wrote: > > static int hpet_next_event(unsigned long delta, > >struct clock_event_device *evt) > > { > > unsigned long cnt; > > > > cnt = hpet_readl(HPET_COUNTER); > > cnt += delta; > > hpet_writel(cnt, HPET_T0_CMP); > > > > return ((long)(hpet_readl(HPET_COUNTER) - cnt ) > 0); > > } > > > > The generic code takes care of the already expired event. > > The thing is - 2.6.20-rc5-rt3 didn't provide clockevent on PXA, so, I took > Sascha's patch instead of my own, which I've been using with 2.6.18, as > his patches were already submitted to various lists and had chances to > become mainline. And strait away it didn't work. The code above seems to > be doing something close to Sascha's patch, so, I expect it would behave > in the same way. And until I introduced a minimum increment for the match > register, it didn't work. I either got hangs, or WARN_ON dumps about "time > warp detected". I think, any timer related code for PXA has to be tested > on real hardware under significant (real-time) load before going upstream. > Haven't tested -rt7 though, so, maybe it is already handled there? No, as I'm reworking clock events a bit and I added the handling for the match register based devices. The above will catch the event in the past and the generic code handles that. Will be on -rt soon. tglx - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch] remove __devinit markings from rtc_sysfs_add_device()
the sysfs interface from the rtc framework seems to incorrectly label the add function with __devinit ... the proc and dev interfaces do not have this label on their add functions ive been trying to develop a rtc module and it kept crashing ... after debugging it, i'm pretty sure ive traced it back to the devinit markings ... dropping this lets my module load nicely :) the crash would happen after my rtc called rtc_device_register ... down in class_device_add in drivers/base/class.c, the active class interface list is walked and the add function is checked ... if it's non-null (aka in some interface would like to be notified of additions), then it's called with the new device information on my board, this add pointer would seemingly point into garbage because the memory it refers to was freed by the kernel :( -mike pgplKQqGQ4uDn.pgp Description: PGP signature rtc_sysfs_add_device is needed even after dev initialization, so drop __devinit. Signed-off-by: Mike Frysinger <[EMAIL PROTECTED]> diff --git a/drivers/rtc/rtc-sysfs.c b/drivers/rtc/rtc-sysfs.c index 9418a59..2ddd0cf 100644 --- a/drivers/rtc/rtc-sysfs.c +++ b/drivers/rtc/rtc-sysfs.c @@ -78,7 +78,7 @@ static struct attribute_group rtc_attr_group = { .attrs = rtc_attrs, }; -static int __devinit rtc_sysfs_add_device(struct class_device *class_dev, +static int rtc_sysfs_add_device(struct class_device *class_dev, struct class_interface *class_intf) { int err;
Re: [PATCH] Undo some of the pseudo-security madness
At Sat, 20 Jan 2007 17:37:22 +0300, Samium Gromoff wrote: [snip] > So, here we have a buffer-overflow protection technique, which does not > actually protect against buffer overflows[1], breaking valid applications. > > I suggest getting rid of it. i botched it slightly: --- linux/include/linux/personality.h 2007-01-20 17:31:01.0 +0300 +++ linux-sane/include/linux/personality.h 2007-01-20 17:32:50.0 +0300 @@ -40,7 +40,7 @@ * Security-relevant compatibility flags that must be * cleared upon setuid or setgid exec: */ -#define PER_CLEAR_ON_SETID (READ_IMPLIES_EXEC|ADDR_NO_RANDOMIZE) +#define PER_CLEAR_ON_SETID (READ_IMPLIES_EXEC) Signed-off-by: Samium Gromoff <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 3/3] clockevent driver for arm/pxa2xx
On Fri, 19 Jan 2007, Thomas Gleixner wrote: > On Fri, 2007-01-19 at 20:13 +0100, Guennadi Liakhovetski wrote: > > > +static u32 clockevent_mode = 0; > > > + > > > +static void pxa_set_next_event(unsigned long evt, > > > + struct clock_event_device *unused) > > > +{ > > > + OSMR0 = OSCR + evt; > > > +} > > > > This doesn't work for me in various nasty ways. Please, check for a > > minimum delay or loop to get ahead of time. See code in the "old" timer > > ISR. See how it unconditionally adds at least 10 ticks... > > I added support for match register based devices and you want to do > something like this: > > static int hpet_next_event(unsigned long delta, >struct clock_event_device *evt) > { > unsigned long cnt; > > cnt = hpet_readl(HPET_COUNTER); > cnt += delta; > hpet_writel(cnt, HPET_T0_CMP); > > return ((long)(hpet_readl(HPET_COUNTER) - cnt ) > 0); > } > > The generic code takes care of the already expired event. The thing is - 2.6.20-rc5-rt3 didn't provide clockevent on PXA, so, I took Sascha's patch instead of my own, which I've been using with 2.6.18, as his patches were already submitted to various lists and had chances to become mainline. And strait away it didn't work. The code above seems to be doing something close to Sascha's patch, so, I expect it would behave in the same way. And until I introduced a minimum increment for the match register, it didn't work. I either got hangs, or WARN_ON dumps about "time warp detected". I think, any timer related code for PXA has to be tested on real hardware under significant (real-time) load before going upstream. Haven't tested -rt7 though, so, maybe it is already handled there? Thanks Guennadi --- Guennadi Liakhovetski - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Serial port blues
I'm forwarding this post by the author of a great little program for digital amateur radio on Linux, because I'm curious whether or not the problem he is seeing can be resolved outside the kernel. All comments welcome on/off list. Thanks, Joe Barr K1GPL -- It's a strange world when proprietary software is not worth stealing, but free software is. --- Begin Message --- I've spent the last day staring at the oscilloscope and pins RTS and DTR on the serial output for 4 different computers running 4 different versions of Linux. Also have exhausted the search on the internet for information regarding both the latency and jitter associated with ioctl calls to the serial driver (both ttyS and ttyUSB). I'm sure it is out there somewhere, I just cannot find it. I am now convinced that the current serial port drivers available to us on the Linux platform WILL NOT support CW and/or RTTY that is software generated in a satisfactory manner. To test the latency and jitter of the ioctl calls to set or clear RTS and / or DTR I built a basic square wave generator with microsecond timing precision. The timing could be derived either from the select system call or by controlled i/o to the sound card. Both provide very precise timing of the program loop. Each time through the loop either the RTS/DTR was set or cleared. The timing jitter for each 1/2 cycle was from 0 to +4 msec. This varied between systems as each had different cpu clock rates. The jitter is caused by the asynchronous response of the kernel to the request to control the port. ioctl requests apparantly do not have a very high priority for the kernel. They are probably just serviced by a first-in first-out interrupt service request loop. That type of jitter is tolerable up to about 20 wpm CW. It totally wipes out the ability to generate an FSK signal on the DTR or RTS pin. Direct access to the serial port(s) is a kernel perogative in Linux. Only kernel level drivers are allowed such port access. So ... bottom line is that all of my attempts over the past couple of months to provide CW and / or FSK output signal have been to fraught with pitfalls. The CW seems OK for slow speed keying, but the FSK seems impossible to achieve. The FSK using the UART is also limited by the Linux operating system and the current drivers. That limitation excludes the use of 45 or 56 baud BAUDOT. Until such time as new information becomes available I am going to comment out all references to CW and / or FSK via RTS/DTR. I also question how useful the FSK on TxD (UART derived) might be to most users since the 45.45 baudrate is not available in the serial port driver. That function will also be commented out. All this should not really come as a surprise since Linux is not a real-time operating system. By the way, I did try the tests with the test program running with nice -20. Not much difference. Sorry folks, but we win some and lose some. 73, Dave, W1HKJ --- End Message ---
Re: [ANNOUNCE] System Inactivity Monitor v1.0
Bill Davidsen <[EMAIL PROTECTED]> writes: Alessandro Di Marco wrote: > Hi all, > > this is a new 2.6.20 module implementing a user inactivity trigger. Basically > it acts as an event sniffer, issuing an ACPI event when no user activity is > detected for more than a certain amount of time. This event can be successively > grabbed and managed by an user-level daemon such as acpid, blanking the screen, > dimming the lcd-panel light à la mac, etc... Any idea how much power this saves? And for the vast rest of us who do run X, this seems to parallel the work of a well-tuned screensaver. This is just a notifier; to make it work as a screensaver you'll have to rely on some external programs. Personally I use smartdimmer to dim my vaio panel. Obviously you can keep your toaster flying, if you like, simply calling the flying-toaster module instead of smartdimmer. Anyway I would use the latter on battery. ;-) Best, -- "What made the deepest impression upon you?" inquired a friend one day of Lincoln, "when you stood in the presence of the Falls of Niagara, the greatest of natural wonders?" "The thing that stuck me most forcibly when I saw the Falls," Lincoln responded with the characteristic deliberation, "was where in the world did all that water come from?" - Author Unknown - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/