[RT] Compile error with certain IRQ debug config option combinations
I'm using 2.6.24.2 with -rt1 applied (applies cleanly with some offsets) on x86. I've *not* configured PROVE_LOCKING (too expensive). However, I selected all the latency tracings, timings and histograms, among them CRITICAL_IRQSOFF_TIMING. This implicitly configures TRACE_IRQFLAGS. However, TRACE_IRQFLAGS depends on PROVE_LOCKING, which is off (and not implicitly selected). With TRACE_IRQFLAGS on and PROVE_LOCKING off, compilation of kernel/lockdep.c fails miserably: CC kernel/lockdep.o kernel/lockdep.c: In function 'check_usage_forwards': kernel/lockdep.c:1740: error: 'find_usage_bit' undeclared (first use in this function) kernel/lockdep.c:1740: error: (Each undeclared identifier is reported only once kernel/lockdep.c:1740: error: for each function it appears in.) kernel/lockdep.c:1742: error: implicit declaration of function 'find_usage_forwards' kernel/lockdep.c:1746: error: 'forwards_match' undeclared (first use in this function) kernel/lockdep.c: In function 'check_usage_backwards': kernel/lockdep.c:1759: error: 'find_usage_bit' undeclared (first use in this function) kernel/lockdep.c:1761: error: implicit declaration of function 'find_usage_backwards' kernel/lockdep.c:1765: error: 'backwards_match' undeclared (first use in this function) make[1]: *** [kernel/lockdep.o] Error 1 Looking at the source quickly, it seems that the #ifdefs do not cover that combination correctly, but I can't figure out how to fix that (the ifdef nesting in that file seems to have grown over time...). By the way: With all the tracing on, make also tries to compile scripts/trace-it.c (with HOSTCC), which also fails if the kernel is cross-compiled in a non-linux development environment (Interix in my case - don't ask me, I didn't vote for it...). Many thanks in advance for any help! Greetings -- Klaus Kusche, DI. Dr.Software Development - Control KEBA AGGewerbepark Urfahr, A-4041 Linz, Austria www.keba.com Firmenbuchgericht Linz FN 184376 t mailto:[EMAIL PROTECTED] Phone: +43 732 7090-23120 Fax: +43 732 7090-63401 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RT 45-01: CF Card read: High latency?
>From the three sources of multi-millisecond latency in my experiments (console messages to dead serial console, USB I/O, CF Card bulk read), I've analyzed one: The latency of around 70 milliseconds in low-priority RT processes when running a "dd if=/dev/hda of=/dev/null" in parallel (where hda is a CF card) is due to readahead. Two indications for that: * Per default, maximum readahead is 128 KB. 70 ms is exactly the time needed to transfer 128 KB (our CF cards have a sustained transfer rate of 1800 KB/sec). * If the readahead constants in the kernel are changed, the observed latencies scale almost linearly. So obviously: * The PIO mode IDE transfers keep the CPU 100 % busy. * No other processes get scheduled between the data transfer of individual "sectors", i.e. there are no holes between the IDE interrupts. However, there is no need to change the readahead mechanism, because even the transfer of a single sector would exceed our latency requirements. So we either need to switch to DMA (which we can't in the short term), or run the time-critical tasks above IDE priority. So the question is, what exactly is the IDE priority? Is the PIO transfer done in the IRQ handler or in a bh? -- Klaus Kusche (Software Development - Control Systems) KEBA AG Gewerbepark Urfahr, A-4041 Linz, Austria (Europe) Tel: +43 / 732 / 7090-3120 Fax: +43 / 732 / 7090-6301 E-Mail: [EMAIL PROTECTED]WWW: www.keba.com - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
BUG in RT 45-01 when RT program dumps core
When a process running with RT priority dumps core, I get the following BUG: Apr 11 13:44:23 OF455 kern.err kernel: BUG: rtc2:833 RT task yield()-ing! Apr 11 13:44:23 OF455 kern.warn kernel: [] yield+0x61/0x70 (8) Apr 11 13:44:23 OF455 kern.warn kernel: [] coredump_wait+0x79/0xc0 (20) Apr 11 13:44:23 OF455 kern.warn kernel: [] do_coredump+0xf3/0x200 (92) Apr 11 13:44:23 OF455 kern.warn kernel: [] kmem_cache_free+0x49/0x120 (32) Apr 11 13:44:23 OF455 kern.warn kernel: [] atomic_dec_and_spin_lock+0x3b/0x50 (24) Apr 11 13:44:23 OF455 kern.warn kernel: [] __dequeue_signal+0x105/0x160 (20) Apr 11 13:44:23 OF455 kern.warn kernel: [] get_signal_to_deliver+0x334/0x350 (48) Apr 11 13:44:23 OF455 kern.warn kernel: [] do_signal+0x98/0x180 (44) Apr 11 13:44:23 OF455 kern.warn kernel: [] timer_interrupt+0x46/0x70 (108) Apr 11 13:44:23 OF455 kern.warn kernel: [] handle_IRQ_event+0x5b/0xe0 (8) Apr 11 13:44:23 OF455 kern.warn kernel: [] __do_IRQ+0x111/0x190 (48) Apr 11 13:44:23 OF455 kern.warn kernel: [] do_page_fault+0x0/0x530 (16) Apr 11 13:44:23 OF455 kern.warn kernel: [] do_notify_resume+0x37/0x3c (8) Apr 11 13:44:23 OF455 kern.warn kernel: [] work_notifysig+0x13/0x15 (8) Klaus Kusche > Entwicklung Software - Steuerung > Software Development - Control > > KEBA AG > A-4041 Linz > Gewerbepark Urfahr > Tel +43 / 732 / 7090-3120 > Fax +43 / 732 / 7090-6301 > E-Mail: [EMAIL PROTECTED] > www.keba.com > > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: 2.6.11, USB: High latency?
> * kus Kusche Klaus <[EMAIL PROTECTED]> wrote: > >IRQ 7-724 0d..11us : end_8259A_irq (do_hardirq) > >IRQ 7-724 0d..11us!: enable_8259A_irq (do_hardirq) > >IRQ 7-724 0d... 832us : do_hardirq (do_irqd) > >IRQ 7-724 0d... 833us : trace_irqs_on (do_hardirq) > > > mmap-1000 0d.h1 21us : end_8259A_irq (__do_IRQ) > > mmap-1000 0d.h1 22us!: enable_8259A_irq (__do_IRQ) > > mmap-1000 0d.h. 662us : irq_exit (do_IRQ) > > mmap-1000 0d..1 662us : do_softirq (irq_exit) > > > mmap-1000 0d.h.0us : do_IRQ (c012d6d5 7 0) > > mmap-1000 0d.h12us!: mask_and_ack_8259A (__do_IRQ) > > mmap-1000 0d.h1 938us : redirect_hardirq (__do_IRQ) > > mmap-1000 0d.h1 939us : wake_up_process (redirect_hardirq) > > such 'freezes' almost certainly signal some sort of hardware > latency - > some device holding the system bus up during DMA. There is no > algorithmic reason for any of those steps above to take > several hundreds > of microseconds. > > Ingo I asked our hardware team. The hardware has two devices which are in use and capable of busmaster/DMA transfers: The intel e100 ethernet controller and the intel PIIX4 USB controller. The IDE interface is also a busmaster, but there are only PIO IDE devices. I suspect the latter, as USB reads were running in parallel... How many bytes are transferred at most by the USB controller for a single request? How long may this take? Any experiences / opinitions / advices? Moreover, we know from experience that the "WBINDV" instruction (Write back and invalidate CPU cache) can cause such latencies. Does this instruction occur anywhere in Linux? -- Klaus Kusche (Software Development - Control Systems) KEBA AG Gewerbepark Urfahr, A-4041 Linz, Austria (Europe) Tel: +43 / 732 / 7090-3120 Fax: +43 / 732 / 7090-8919 E-Mail: [EMAIL PROTECTED]WWW: www.keba.com - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: 2.6.11, USB: High latency?
> > Even when the errors described in my previous mail does not occur, > > massive USB stick transfers cause latencies of 1 to 2 milliseconds, > > which is way too much for realtime control systems. > > do these occur under PREEMPT_RT? If yes, do you get any > useful trace if > you enable all the tracing options but keep wakeup-timing off: > > # CONFIG_WAKEUP_TIMING is not set > CONFIG_PREEMPT_TRACE=y > CONFIG_CRITICAL_PREEMPT_TIMING=y > CONFIG_CRITICAL_IRQSOFF_TIMING=y > CONFIG_CRITICAL_TIMING=y > CONFIG_LATENCY_TIMING=y > CONFIG_LATENCY_TRACE=y > CONFIG_MCOUNT=y > > this should catch any type of preempt-off section, irqs-off and > preempt_disable() alike. (unless the tracer has a bug.) I tried with the trace settings above, but I didn't have good luck: 1.) Tracing slows things down *a lot* (even *before* turned on): * My own rtc test program causes 97 % CPU load when tracing is on. On an RT kernel with tracing off, it consumes just below 20 %, on the standard kernel, it consumes less than 3 %. * With -f 8192, rtc_wakeup does not even start: It blocks the whole system for about 1 minute at "setting up consumer thread". After that, it starts running and produces output, but ends with "ringbuffer full" almost immediately. Surprisingly, when started under the control of "strace", it runs, but reports tons of missed interrupts. With "-f 1024", it works. 2.) Tracing still doesn't seem to catch everything: rtc_wakeup sometimes reported a max. jitter of several milliseconds, but the max trace from /proc/latency_trace showed around 50 microseconds... 3.) However, I got some interesting traces later (see below). They seem to be related to USB read load. 4.) When running rtc_wakeup at low rtpri (e.g. 2(12)), I can still generate very bad "missed interrupts" counts (even on a kernel *without* latency tracing): core dumps: Up to 10 missed interrupts, almost continuously USB reads: Also up to 10 missed interrupts, almost continuously USB reads, mmap load in parallel: Up to 50, almost continuously OOM: Up to 500 CF reads: Up to 600 Moreover, rtc_wakeup shows jitter up to 2500 microseconds Even if there is no special test load running, rtc_wakeup now and then misses an interrupt! All the load is generated at shell priority, not at rt priority! (so, ideally, it should not influence rt threads *at all*) 5.) I still do not understand the effect of "chrt"-ing "IRQ 8" to 95: "IRQ 8" is at 49 per default, and there is no process with rtpri>49 (except for rtc_wakeup itself). Hence, to my understanding, there should not be *any* difference between running "IRQ 8" at 49 or 95! However, when "IRQ 8" is at 95, a dd from the CF card doesn't hurt much. When "IRQ 8" is at 49, the dd causes jitter of several milliseconds! (both measured with rtc_wakeup at 89(99)). Why? "IRQ 14" (IDE) is at 47! There is nothing except rtc_wakeup which could block or preempt the RTC "IRQ 8"? preemption latency trace v1.1.4 on 2.6.12-rc1-RT-V0.7.42-08 latency: 833 us, #7/7, CPU#0 | (M:rt VP:0, KP:1, SP:1 HP:1 #P:1) - | task: IRQ 7-724 (uid:0 nice:-5 policy:1 rt_prio:46) - _--=> CPU# / _-=> irqs-off | / _=> need-resched || / _---=> hardirq/softirq ||| / _--=> preempt-depth / | delay cmd pid | time | caller \ /| \ | / IRQ 7-724 0d...0us : do_hardirq (do_irqd) IRQ 7-724 0d..10us : note_interrupt (do_hardirq) IRQ 7-724 0d..11us : end_8259A_irq (do_hardirq) IRQ 7-724 0d..11us!: enable_8259A_irq (do_hardirq) IRQ 7-724 0d... 832us : do_hardirq (do_irqd) IRQ 7-724 0d... 833us : trace_irqs_on (do_hardirq) Some more of those: preemption latency trace v1.1.4 on 2.6.12-rc1-RT-V0.7.42-08 latency: 675 us, #47/47, CPU#0 | (M:rt VP:0, KP:1, SP:1 HP:1 #P:1) - | task: ksoftirqd/0-2 (uid:0 nice:-10 policy:0 rt_prio:0) - _--=> CPU# / _-=> irqs-off | / _=> need-resched || / _---=> hardirq/softirq ||| / _--=> preempt-depth / | delay cmd pid | time | caller \ /| \ | / mmap-1000 0d...0us : common_interrupt ((0)) mmap-1000 0d.h.0us : do_IRQ (c028e90e 0 0) mmap-1000 0d.h11us+: mask_and_ack_8259A (__do_IRQ) mmap-1000 0d.h15us : redirect_hardirq (__do_IRQ) mmap-1000 0d.h.5us : handle_IRQ_event (__do_IRQ) mmap-1000 0d.h.6us : timer_interrupt (handle_IRQ_event) mmap-1000 0d.h16us+: mark_offset_tsc (timer_interrupt) mmap-1000 0d.h1 14us : do_timer (timer_interru
RE: 2.6.11, IDE: Strange scheduling behaviour: high-pri RT process not scheduled?
> > The following tests are made with 'IRQ 8' at 95, rtc_wakeup > at 89(99): > > * Heavy mmap load, no oom: max jitter: 42.1% ( 51 usec) > > * Heavy mmap load, oom:max jitter: 11989.2% (14635 usec) > > (but still "missed irqs: 0", so IRQ 8 was also blocked for 14 ms) > > did you get any kernel messages in that time? (about missed > irqs, etc.) > Please do a 'dmesg -n 0' to minimize the effect of kernel messages. Excellent, thanks! It turned out that the latencies are not caused by the kernel messages themselves, but by sending them to a serial console (which was off), in all my high latency cases at rtpri 89(99). After removing the serial console from the boot parameters, * the OOM timings are back to normal (around 50 microseconds) * the USB error and remove timings are back to normal * the USB plugin timings are in the range of the USB read (which is up to 1 ms - still bad) However, latencies at rtprio 2 are still very frustrating (details will follow). -- Klaus Kusche Entwicklung Software - Steuerung Software Development - Control KEBA AG A-4041 Linz Gewerbepark Urfahr Tel +43 / 732 / 7090-3120 Fax +43 / 732 / 7090-8919 E-Mail: [EMAIL PROTECTED] www.keba.com - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: 2.6.11, USB: High latency?
> > > The latencies are almost certainly caused by the USB host > controller > > > driver. I'm planning improvements to uhci-hcd which should > > > help reduce > > > the latency, but it will still be on the large side. And I > > > won't have > > > time to write the changes to the driver for several months. > > > > Any numbers about the expected "large side"? > > We would need <30 microseconds irq latency, > > and <<1 milliseconds rt application latency. > > The biggest advantage would come from using a bottom-half > handler to do > most of the work. Right now the uhci-hcd driver does > everything in its > interrupt handler. This would certainly help IRQ latency; it > might not > affect application latency very much. Sounds very reasonable, thanks. Also helps application latency, because with the RT patches, I can tune the rt prio of softirq execution (that's where bottom-half goes, doesn't it?) w.r.t. the rt prio of the application threads. However, if I understand things correctly, if you really need to disable all interrupts while doing the USB work, it will not make any difference if IRQs are disabled while you are in the USB IRQ handler, or if they are disabled for the same amount of work/time in the bottom-half code. > I'll try adding a bottom half in my next series of patches. > Maybe it will > be ready in time to appear in the -mm kernels before 2.6.12-final is > released. > We'll see what happens with the upcoming changes. Maybe > you'll be able to > test them for me? Basically, yes (as long as our company doesn't decide to stop the linux experiments). However, I depend on Ingo's RT patch, which is against the -rc series, not against the -mm series. So I will probably not be able to apply patches created against -mm. -- Klaus Kusche Entwicklung Software - Steuerung Software Development - Control KEBA AG A-4041 Linz Gewerbepark Urfahr Tel +43 / 732 / 7090-3120 Fax +43 / 732 / 7090-8919 E-Mail: [EMAIL PROTECTED] www.keba.com - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [BUG] 2.6.11: Random SCSI/USB errors when reading from USB memory stick
> > > Latency is the subject of a separate email. Does this > > > increase in latency > > > occur only when you see the errors, or whenever you do a > large data > > > transfer? In fact, I would suspect the errors to _decrease_ > > > the latency > > > with respect to a normal transfer. > > > > I observe from <1 to 2 ms on successful transfers, and around 15 ms > > latency when things go wrong. > > I hate to ask this question; it sounds an awful lot like > "Monty Python and > the Holy Grail", but... Is that IRQ latency or application latency? With Ingo's RT kernels, it is not possible for me to tell the difference, because IRQ handlers are also running as kernel threads, scheduled by the scheduler according to their rt prio (which is around 50 by default, 95 for the RTC IRQ in my case). I can tell for sure that the RTC IRQ handler was not executed in that time, but I can't tell if that's because IRQs were blocked or because it didn't get scheduled. > I can't think of any reason why IRQ latency should change > during the error > handling. Application latency might change because the SCSI > error handler > could start using up a lot of CPU time. I don't know what > priority it > runs at; you can check with ps. Ingo just suggested that error handling (writing kernel messages to console/disk) in general is causing the trouble. I'll check that and post the results. -- Klaus Kusche Entwicklung Software - Steuerung Software Development - Control KEBA AG A-4041 Linz Gewerbepark Urfahr Tel +43 / 732 / 7090-3120 Fax +43 / 732 / 7090-8919 E-Mail: [EMAIL PROTECTED] www.keba.com - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: 2.6.11, IDE: Strange scheduling behaviour: high-pri RT process not scheduled?
> getting /dev/rtc handling right for latency measurement is > ... tricky. > The method i'm using under PREEMPT_RT is: > > chrt -f 84 -p `pidof 'IRQ 0'` > chrt -f 95 -p `pidof 'IRQ 8'` > ./rtc_wakeup -f 1024 -t 10 I tried it your way. First impressions with realtime-preempt-2.6.12-rc1-V0.7.42-01, rtc_wakeup and chrt: The main difference is not between rtc_wakeup and my test script (the timings agree very well), but comes from chrt for 'IRQ 8': System not loaded, original 'IRQ 8' prio 49: * rtc_wakeup at 89(99): max jitter: 378.2% (461 usec), missed irqs: 0 * rtc_wakeup at 1(11): max jitter: 97.1% (118 usec), missed irqs: 25 System not loaded, 'IRQ 8' prio at 95: * rtc_wakeup at 89(99): max jitter: 19.1% ( 23 usec), missed irqs: 0 * rtc_wakeup at 1(11): max jitter: 190.7% (232 usec), missed irqs: 52 The following tests are made with 'IRQ 8' at 95, rtc_wakeup at 89(99): * Heavy mmap load, no oom: max jitter: 42.1% ( 51 usec) * Heavy mmap load, oom:max jitter: 11989.2% (14635 usec) (but still "missed irqs: 0", so IRQ 8 was also blocked for 14 ms) * USB reads, no error: max jitter:707.7% ( 863 usec) (happens rarely) * USB reads, error:max jitter: 7247.2% ( 8846 usec) (again, no missed irqs) * USB stick connect: max jitter: 5776.7% ( 7051 usec) * USB stick disconnect:max jitter: 2966.6% ( 3621 usec) (both without error!) Notes: * My system does not have a 'IRQ 0' kernel thread. It has IRQ 1, 3, 7, 8, 12 and 14. This is strange, because /proc/interrupts says 0:4805658 XT-PIC timer 0/5658 2: 0 XT-PIC cascade 0/0 3: 0 XT-PIC serial 0/0 7: 17538 XT-PIC uhci_hcd:usb1, eth0 0/17538 8: 838523 XT-PIC rtc 0/38523 14: 5775 XT-PIC ide0 0/5775 * The syntax of chrt has changed: chrt -f -p * I run rtc_wakeup with -f 8192. This is more realistic in our case. CF card (IDE) load does not hurt as long as the rtc test runs with a very high rtpri: The results did not differ significantly from no-load results. Attempts with tracing on and results with running the test at low rtpri will follow tomorrow... -- Klaus Kusche Entwicklung Software - Steuerung Software Development - Control KEBA AG A-4041 Linz Gewerbepark Urfahr Tel +43 / 732 / 7090-3120 Fax +43 / 732 / 7090-8919 E-Mail: [EMAIL PROTECTED] www.keba.com - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: 2.6.11, USB: High latency?
> I couldn't find that previous email in the MARC archives. > > Regardless, you'd have to provide a small bit of information about > your hardware configuration. What device speed: full or high? > What controller: EHCI, OHCI, UHCI, something else? Which driver > for the stick: usb-storage, or ub? What else was using memory > and PCI bandwidth at the time? SMP? The error occurred on an intel Pentium 3 (500 MHz) embedded system with 440BX chipset and 192 MB RAM. USB is handled by the 440BX (intel 82371 PIIX4). The UHCI driver shares interrupt 7 with an intel 82559ER 100 Mbit ethernet controller (which is driven by the e100 driver and active: As there is no local keyboard, I access the system by ssh). The system "disk" is a 128 MB CF card directly connected to the 440BX primary IDE port and running in PIO mode 2 at about 2 MB/sec peak (but it is idle most of the time). There is a SM712 VGA chip running in text mode, a 1000 HZ std PC timer, and no other "interesting" device (nothing else on the PCI bus or causing any interrupts). The error was reproduced with statically linked (no modules) vanilla-2.6.11, 2.6.11-gentoo-r3, and realtime-preempt-2.6.12-rc1-V0.7.41-11 kernels, all built with gcc 3.4.3. No SMP. USB-storage for the sticks. I tried with two different sticks (an old 64 MB USB 1.x and a 1 GB USB 2.x), both show the same problem on all USB interfaces on the target. The same dd command works fine on both sticks on my office PC. -- Klaus Kusche Entwicklung Software - Steuerung Software Development - Control KEBA AG A-4041 Linz Gewerbepark Urfahr Tel +43 / 732 / 7090-3120 Fax +43 / 732 / 7090-8919 E-Mail: [EMAIL PROTECTED] www.keba.com - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [BUG] 2.6.11: Random SCSI/USB errors when reading from USB memory stick
> Latency is the subject of a separate email. Does this > increase in latency > occur only when you see the errors, or whenever you do a large data > transfer? In fact, I would suspect the errors to _decrease_ > the latency > with respect to a normal transfer. I observe from <1 to 2 ms on successful transfers, and around 15 ms latency when things go wrong. I do not know what the kernel does internally, but I tried block sizes from 512b to 8192kb with dd, and it did not make any difference, neither w.r.t. timing nor w.r.t. errors. > This indicates that the problem lies in the computer, or its USB > connectors and their electromagnetic environment. Hmmm, hard to believe. It happens on all USB ports, on two different controllers. The box is an industrial-grade embedded control system, in a robust steel case. There is no external USB cable involved, the stick is plugged directly into the box. The USB ports of these boxes work fine in vxWorks (don't know about Windows). > > I suspect some kind of timing / bus / interrupt trouble, > either due to > > the interrupt shared between uhci and e100, or due to the > extremely slow > > PIO transfers (which seem to block interrupts). > > Timing or interrupt trouble wouldn't cause the problems you see. Bus > utilization can cause problems, but they would be reported > differently. > I think your problem is due either to faulty hardware or to bad signal > propagation. If it's not caused by bus utilization or shared or lost irqs, I think it is some kind of software problem, maybe in the driver, maybe elsewhere: The dd command *always* runs fine for some time, up to now the transfer never had any problems for the first 30 MB or so, sometimes even for 500 MB on my 1 GB stick. Is there any possibility for integer overflow? Is there any data which may accumulate over time? > > What can I do to solve or further analyze the problem? > > Replace the USB cable. Use a separate PCI USB controller > card, preferably > one with a high-speed USB controller A cable is not involved, and the embedded system doesn't offer PCI slots. > > Mar 30 10:47:16 OF455 kern.debug kernel: usb-storage: *** thread > > awakened. > > Mar 30 10:47:16 OF455 kern.debug kernel: usb-storage: > Command READ_10 > > (10 bytes) > > Mar 30 10:47:16 OF455 kern.debug kernel: usb-storage: 28 > 00 00 01 bd c0 > > 00 00 f0 00 > > Mar 30 10:47:16 OF455 kern.debug kernel: usb-storage: Bulk Command S > > 0x43425355 T 0x38e L 122880 F 128 Trg 0 LUN 0 CL 10 > > Mar 30 10:47:16 OF455 kern.debug kernel: usb-storage: > > usb_stor_bulk_transfer_buf: xfer 31 bytes > > Mar 30 10:47:16 OF455 kern.debug kernel: usb-storage: Status code 0; > > transferred 31/31 > > Mar 30 10:47:16 OF455 kern.debug kernel: usb-storage: -- transfer > > complete > > Mar 30 10:47:16 OF455 kern.debug kernel: usb-storage: Bulk command > > transfer result=0 > > Mar 30 10:47:16 OF455 kern.debug kernel: usb-storage: > > usb_stor_bulk_transfer_sglist: xfer 122880 bytes, 30 entries > > Mar 30 10:47:17 OF455 kern.debug kernel: usb-storage: > Status code -75; > > transferred 19840/122880 > > Mar 30 10:47:17 OF455 kern.debug kernel: usb-storage: -- babble > > This is not a "short read"; as the log says, it's a "babble". > That means > the computer's USB controller believed the device continued > transmitting > past the time when it was supposed to stop. Assuming the device is > functioning correctly, either the USB controller is bad or > the signal was > degraded. But doesn't it say 19840 of 122880 transferred? In all cases of trouble, the first number is smaller than the second. -- Klaus Kusche Entwicklung Software - Steuerung Software Development - Control KEBA AG A-4041 Linz Gewerbepark Urfahr Tel +43 / 732 / 7090-3120 Fax +43 / 732 / 7090-8919 E-Mail: [EMAIL PROTECTED] www.keba.com - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: 2.6.11, USB: High latency?
> The latencies are almost certainly caused by the USB host controller > driver. I'm planning improvements to uhci-hcd which should > help reduce > the latency, but it will still be on the large side. And I > won't have > time to write the changes to the driver for several months. Any numbers about the expected "large side"? We would need <30 microseconds irq latency, and <<1 milliseconds rt application latency. > The best solution is to stop using uhci-hcd. Get a PCI card > with an OHCI > or EHCI (high-speed) controller. They do much more work in hardware, > reducing the amount of time the driver needs to spend with interrupts > disabled. The hardware is invariable. It is an embedded system with no PCI slots. And it seems to be possible with UHCI, because vxWorks allows USB stick transfers in operation without missing latency requirements. I do not require rt on the USB, it may block its own irq as long as it likes, but it should not affect other irqs. -- Klaus Kusche Entwicklung Software - Steuerung Software Development - Control KEBA AG A-4041 Linz Gewerbepark Urfahr Tel +43 / 732 / 7090-3120 Fax +43 / 732 / 7090-8919 E-Mail: [EMAIL PROTECTED] www.keba.com - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: 2.6.11, IDE: Strange scheduling behaviour: high-pri RT process not scheduled?
> > The test system runs a 2.6.11 kernel (no SMP) on a Pentium3 500 MHz > > embedded hardware. > > which probably has memory bandwidth of at most a couple hundred MB/s, > which is really horrible by modern standards. It's my job to find out if linux can be used for control systems on this (and smaller: ARM, embedded PPC) hardware. It's not my job to discuss the hardware. This hardware is quite fast for embedded control system standards, where product lifetimes exceed 10 years. AFAIK, intel does not offer any P4 products with industrial availability and lifetime gurantee (and they would run too hot anyway). > > However, things break seriously when exercising the CF card > in parallel > > (e.g. with a dd if=/dev/hda of=/dev/null): > > > > * The rtc *interrupt handler* is delayed for up to 250 > *micro*seconds. > > This is very bad for my purpose, but easy to explain: It is > roughly the > > time needed to transfer 512 Bytes from a CF card which can > transfer 2 > > Mbyte/sec, and obviously, the CPU blocks all interrupts > while making pio > > > > transfers. (Why? Is this really necessary?) > > even with -u1, isn't there still a softirq queue that will > delay the wakeup > of your user-space tester? I assume your comment belongs to the timings of my test program (below), not to the timings of the rtc irq handler (above): The softirq queue should not block the rtc irq handler I think. The softirq queue should not even block user-space rt programs for 60-300 ms. If something takes that long, it should perhaps be implemented as threads with a well-known rtprio, not in the softirq queue. > > * The *test program* is regularly blocked for 63 *milli*seconds, > > sometimes for up to 300 *milli*seconds, which is absolutely > > unacceptable. > > guessing that's VM housekeeping. I would feel better if I knew instead of guessing. Moreover, VM housekeeping caused by non-rtpri programs should not block memory-resident rt programs. And the VM housekeeping in this case is just too simple to waste milliseconds on it: Just take one page of free mem (there is plenty of it) and use it as a buffer... > > Now the big question: > > *** Why doesn't my rt test program get *any* CPU for 63 jiffies? *** > > (the system ticks at 1000 HZ) > > because it's user-space. the 'rt' is a bit of a misnomer - > it's merely > a higher priority, less preemptable job. > > > * The dd program obviously gets some CPU regularly (because > it copies 2 > > MB/s, and because no other program could cause the 1 % user > CPU load). > > The dd runs at normal shell scheduling priority, so it should be > > preempted immediately by my test program! > > out of curiosity, does running it with "nice -n 19" change anything? no. > > 2.) Using a realtime-preempt-2.6.12-rc1-V0.7.41-11 kernel with > > PREEMPT_RT: > > If my test program runs at rtpri 99, the problem is gone: It is > > scheduled within 30 microseconds after the rtc interrupt. > > If my test program runs at rtpri 1, it still suffers > from delays > > in the 30 to 300 millisecond range. > > so your problem is solved, no? No, running at an rtpri above the irq handlers is not an option in practice: My program would block irq handling... > also, did you try a (plain) preemptable kernel? What I called "vanilla" was configured with "plain" preemption. -- Klaus Kusche Entwicklung Software - Steuerung Software Development - Control KEBA AG A-4041 Linz Gewerbepark Urfahr Tel +43 / 732 / 7090-3120 Fax +43 / 732 / 7090-8919 E-Mail: [EMAIL PROTECTED] www.keba.com - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.41-25
> i have released the -V0.7.41-25 Real-Time Preemption patch, > which can be > downloaded from the usual place: 1. Does not compile without RT_DEADLOCK_DETECT: kernel/rt.c: In function `change_owner': kernel/rt.c:556: error: structure has no member named `debug' kernel/rt.c: In function `pi_setprio': kernel/rt.c:576: error: structure has no member named `debug' kernel/rt.c: In function `task_blocks_on_lock': kernel/rt.c:677: error: structure has no member named `debug' kernel/rt.c:687: error: structure has no member named `debug' kernel/rt.c: In function `__up_mutex': kernel/rt.c:1223: error: structure has no member named `debug' 2. My problem (see my LKML mails yesterday) is not yet solved: The latency tracer shows latencies of at most 40 microseconds, but my test program at rtprio 99 sometimes did not get any CPU for milliseconds... -- Klaus Kusche Entwicklung Software - Steuerung Software Development - Control KEBA AG A-4041 Linz Gewerbepark Urfahr Tel +43 / 732 / 7090-3120 Fax +43 / 732 / 7090-8919 E-Mail: [EMAIL PROTECTED] www.keba.com - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
2.6.11, USB: High latency?
I'm performing realtime latency tests (for details about the hardware and software, see my mail "[BUG] 2.6.11: Random SCSI/USB errors when reading from USB memory stick" erlier today). Even when the errors described in my previous mail does not occur, massive USB stick transfers cause latencies of 1 to 2 milliseconds, which is way too much for realtime control systems. I observe these latencies on a vanilla 2.6.11 at any rtprio (even 99), and on realtime-preempt-2.6.12-rc1-V0.7.41-11 at low rtprio (1). When running the program on realtime-preempt-2.6.12-rc1-V0.7.41-11 with rtprio 99, the latencies are gone, but using a rtprio higher than the interrupt handlers is not realistic. Is there anything which can be done about it? Klaus Kusche > Entwicklung Software - Steuerung > Software Development - Control > > KEBA AG > A-4041 Linz > Gewerbepark Urfahr > Tel +43 / 732 / 7090-3120 > Fax +43 / 732 / 7090-8919 > E-Mail: [EMAIL PROTECTED] > www.keba.com > > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
PREEMPT_RT: Undetected latencies?
I'm experimenting with a realtime-preempt-2.6.12-rc1-V0.7.41-11 kernel, preemption fully enabled, and a small RTC test program (see my previous mail for details). If my program runs at rtprio 99, most of the time everything is fine: * My program records intervals of 122 +- 40 microseconds (which corresponds nicely to the 8192 Hz RTC interrupt). * Both the kernel's RTC histogram and the kernel's RT latency trace stay below 40 microseconds. However, during extensive testing, I found some cases which extend the intervals unacceptably: 1.) When a test program performs massive mmaps (at base shell prio) and activates the oom killer, my program measures intervals in the 15 millisecond range. 2.) When I get the USB errors described in a separate mail earlier today, I also get intervals from 10 to 15 milliseconds. (a combination of heavy USB and CF card traffic also seems to cause blackouts in the 1 ms range even for rtprio 99, but I have to look into that again) Surprisingly, in both cases the worst latencies traced by the RT latency trace and the RTC histogram are still less than 40 microseconds! Moreover, the values read from /dev/rtc do not indicate any lost interrupts. This seems to indicate that neither the RTC IRQ handler itself nor timers were executed during these 15 millisecond period, otherwise, either the RTC IRQ handler or the rtc_dropped_irq watchdog would have counted lost interrupts! It seems that something locked up the system completely for 15 ms: No interrupts at all, no scheduling, ... Now, there are several questions: * What is causing these 15 ms latencies in the two cases above, and how can they be fixed? * Why aren't they detected at all by any traces? Are these classical spinlocks which aren't instrumented? Are they explicit cli/sti in device drivers? Any other possibilities? How can this be traced and analyzed? A simple explanation would be that the TSC is reset. Is there any code in the kernel messing with the TSC? If my program runs at rtprio 1 instead of rtprio 99, many things cause long intervals: * normal working load (up to 1.2 ms) * heavy mmap load without oom (many samples increased by 0.2 ms) * heavy USB stick I/O without errors (up to 3 ms) * core dumps (up to 10 ms) * heavy USB stick I/O with errors (up to 15 ms) * oom killer (up to 100 ms) * heavy CF card I/O (up to 300 ms, see separate mail) Even on an idle system, I see 1 ms intervals now and then! Again, the latency traces are in the range of 40 to 50 microseconds only, so these delays are caused by something not covered by the latency tracer. How to find that out? One more question: When my test program runs at rtprio 1, there are huge delays (up to 100 ms) between the moment the rtc read gets ready and the moment the program is actually scheduled. However, even then the wakeup latency trace is not showing it. What is actually measured by the wakeup latency trace? Thanks for any help! Klaus Kusche > Entwicklung Software - Steuerung > Software Development - Control > > KEBA AG > A-4041 Linz > Gewerbepark Urfahr > Tel +43 / 732 / 7090-3120 > Fax +43 / 732 / 7090-8919 > E-Mail: [EMAIL PROTECTED] > www.keba.com > > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: 2.6.11, IDE: Strange scheduling behaviour: high-pri RT process not scheduled?
> From: Bartlomiej Zolnierkiewicz [mailto:[EMAIL PROTECTED] > > On Wed, 30 Mar 2005 13:52:05 +0200, kus Kusche Klaus > <[EMAIL PROTECTED]> wrote: > > However, things break seriously when exercising the CF card > in parallel > > (e.g. with a dd if=/dev/hda of=/dev/null): > > > > * The rtc *interrupt handler* is delayed for up to 250 > *micro*seconds. > > This is very bad for my purpose, but easy to explain: It is > roughly the > > time needed to transfer 512 Bytes from a CF card which can > transfer 2 > > Mbyte/sec, and obviously, the CPU blocks all interrupts > while making pio > > > > transfers. (Why? Is this really necessary?) > > > > (I know because I instrumented the rtc irq handler with > rdtscl(), too) > > hdparm -u1 /dev/hda > > should help Hmmm, thanks, that sounds very reasonable, and I didn't know that flag. Unfortunately, it doesn't help. The bad timings stay the same (still delays in the 30-300 ms range), the number of context switches stays the same, ... The only thing which changes is the CPU load shown by vmstat: * With -u0, I have 1 % user, ~50 % sys, ~50 % wa * With -u1, I have 1 % user, ~98 % sys, 1 % wa P.S.: Apologies for my badly formatted mails. The company forces us to use outlook, we may not even change the settings... Klaus Kusche >Entwicklung Software - Steuerung >Software Development - Control > >KEBA AG >A-4041 Linz >Gewerbepark Urfahr >Tel +43 / 732 / 7090-3120 >Fax +43 / 732 / 7090-8919 >E-Mail: [EMAIL PROTECTED] >www.keba.com > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
2.6.11, IDE: Strange scheduling behaviour: high-pri RT process not scheduled?
I've written a small test program which enables periodic RTC interrupts at 8192 Hz and then goes into a loop reading /dev/rtc and collecting timing statistics (using the rdtscl macro). The program runs at highest realtime scheduling priority (99) with memory locked. In the loop, it doesn't do any I/O except for /dev/rtc reads, no memory allocations, nothing which could block or take time: rdtscl(old); for (i = 0; i < SAMPLES; ++i) { if (read(fd, &rtcval, sizeof (unsigned long)) != sizeof (unsigned long)) { fprintf(stderr, "%s: reading rtc data failed: %s\n", argv[0], strerror(errno)); exit(1); } rtcval >>= 8; --rtcval; if (rtcval >= INTERRUPTS) ++intr[INTERRUPTS]; else ++intr[rtcval]; rdtscl(new); delta = new - old; delta /= RESOLUTION; if (delta >= INTERVALS) ++cnt[INTERVALS]; else ++cnt[delta]; old = new; } The test system runs a 2.6.11 kernel (no SMP) on a Pentium3 500 MHz embedded hardware. All filesystems are on a CF Card connected to the primary IDE controller (intel 440BX chipset) in PIO mode 2. The CF card can transfer up to 2 MB/s. There are 192 MB RAM, most of it free. The system is configured without swap space. As long as the system is idle otherwise, my program works as expected: Most of the time, after an rtc interrupt the program gets the data from /dev/rtc within less than 50 microseconds. Even with the test program running (which causes >16000 context switches per second), vmstat shows more than 95 % idle CPU. Hence, the CPU needed by the rtc irq handler and the test program is minimal. However, things break seriously when exercising the CF card in parallel (e.g. with a dd if=/dev/hda of=/dev/null): * The rtc *interrupt handler* is delayed for up to 250 *micro*seconds. This is very bad for my purpose, but easy to explain: It is roughly the time needed to transfer 512 Bytes from a CF card which can transfer 2 Mbyte/sec, and obviously, the CPU blocks all interrupts while making pio transfers. (Why? Is this really necessary?) (I know because I instrumented the rtc irq handler with rdtscl(), too) * The *test program* is regularly blocked for 63 *milli*seconds, sometimes for up to 300 *milli*seconds, which is absolutely unacceptable. * vmstat shows around 50 % sys CPU and around 50 % I/O wait, and 0 or 1 % user CPU (zero idle CPU). context switches are down to around 500 from the expected >16000, which also indicates that my program is scheduled much less often than it should. Most of the time, vmstat shows one running and one blocked process. Now the big question: *** Why doesn't my rt test program get *any* CPU for 63 jiffies? *** (the system ticks at 1000 HZ) * There is around 50 % idle CPU (I/O wait). I/O wait should not block my program, because my program neither performs any disk I/O nor swaps. * The dd program obviously gets some CPU regularly (because it copies 2 MB/s, and because no other program could cause the 1 % user CPU load). The dd runs at normal shell scheduling priority, so it should be preempted immediately by my test program! Some things I tried: 1.) Using a mdma2 instead of a pio2 CF card: The CF card speed doubles (4 MB/s instead of 2 MB/s), and the "blind time" of my test program halves from 63 ms to 32 ms, but basically, the problem remains. 2.) Using a realtime-preempt-2.6.12-rc1-V0.7.41-11 kernel with PREEMPT_RT: If my test program runs at rtpri 99, the problem is gone: It is scheduled within 30 microseconds after the rtc interrupt. If my test program runs at rtpri 1, it still suffers from delays in the 30 to 300 millisecond range. Thanks for any help! Klaus Kusche > Entwicklung Software - Steuerung > Software Development - Control > > KEBA AG > A-4041 Linz > Gewerbepark Urfahr > Tel +43 / 732 / 7090-3120 > Fax +43 / 732 / 7090-8919 > E-Mail: [EMAIL PROTECTED] > www.keba.com > > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[BUG] 2.6.11: Random SCSI/USB errors when reading from USB memory stick
When trying a "dd if=/dev/sda of=/dev/null" (where /dev/sda is an USB memory stick), this works fine for some seconds (and actually transfers data at around 700-1000 KB/s), but ends up with some I/O errors sooner or later, which cause the device to go offline (the stick must be replugged to make it accessible again). Sometimes, this causes kernel bugchecks and hung "dd" processes (which cannot be terminated even with kill -9), see second example. Two examples (syslog message outputs) are appended below, in both cases, short USB reads seem to initiate the trouble. I also tried running a kernel latency test program simultaneously: When the errors occur, the latency goes up from less than 1 millisec to about 15 millisec (very bad for embedded control systems...). The error occurred on an intel Pentium 3 (500 MHz) embedded system with 440BX chipset and 192 MB RAM. USB is handled by the 440BX (intel 82371 PIIX4). The UHCI driver shares interrupt 7 with an intel 82559ER 100 Mbit ethernet controller (which is driven by the e100 driver and active: As there is no local keyboard, I access the system by ssh). The system "disk" is a 128 MB CF card directly connected to the 440BX primary IDE port and running in PIO mode 2 at about 2 MB/sec peak (but it is idle most of the time). There is a SM712 VGA chip running in text mode, a 1000 HZ std PC timer, and no other "interesting" device (nothing else on the PCI bus or causing any interrupts). The error was reproduced with statically linked (no modules) vanilla-2.6.11, 2.6.11-gentoo-r3, and realtime-preempt-2.6.12-rc1-V0.7.41-11 kernels, all built with gcc 3.4.3. I tried with two different sticks (an old 64 MB USB 1.x and a 1 GB USB 2.x), both show the same problem on all USB interfaces on the target. The same dd command works fine on both sticks on my office PC. I suspect some kind of timing / bus / interrupt trouble, either due to the interrupt shared between uhci and e100, or due to the extremely slow PIO transfers (which seem to block interrupts). What can I do to solve or further analyze the problem? Thanks! *** First example: ... trouble starts here: Mar 30 10:47:16 OF455 kern.debug kernel: usb-storage: *** thread awakened. Mar 30 10:47:16 OF455 kern.debug kernel: usb-storage: Command READ_10 (10 bytes) Mar 30 10:47:16 OF455 kern.debug kernel: usb-storage: 28 00 00 01 bd c0 00 00 f0 00 Mar 30 10:47:16 OF455 kern.debug kernel: usb-storage: Bulk Command S 0x43425355 T 0x38e L 122880 F 128 Trg 0 LUN 0 CL 10 Mar 30 10:47:16 OF455 kern.debug kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes Mar 30 10:47:16 OF455 kern.debug kernel: usb-storage: Status code 0; transferred 31/31 Mar 30 10:47:16 OF455 kern.debug kernel: usb-storage: -- transfer complete Mar 30 10:47:16 OF455 kern.debug kernel: usb-storage: Bulk command transfer result=0 Mar 30 10:47:16 OF455 kern.debug kernel: usb-storage: usb_stor_bulk_transfer_sglist: xfer 122880 bytes, 30 entries Mar 30 10:47:17 OF455 kern.debug kernel: usb-storage: Status code -75; transferred 19840/122880 Mar 30 10:47:17 OF455 kern.debug kernel: usb-storage: -- babble Mar 30 10:47:17 OF455 kern.debug kernel: usb-storage: Bulk data transfer result 0x3 Mar 30 10:47:17 OF455 kern.debug kernel: usb-storage: Attempting to get CSW... Mar 30 10:47:17 OF455 kern.debug kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes Mar 30 10:47:17 OF455 kern.debug kernel: usb-storage: Status code -75; transferred 13/13 Mar 30 10:47:17 OF455 kern.debug kernel: usb-storage: -- babble Mar 30 10:47:17 OF455 kern.debug kernel: usb-storage: Bulk status result = 3 Mar 30 10:47:17 OF455 kern.debug kernel: usb-storage: -- transport indicates error, resetting Mar 30 10:47:17 OF455 kern.debug kernel: usb-storage: usb_stor_Bulk_reset called Mar 30 10:47:17 OF455 kern.debug kernel: usb-storage: usb_stor_control_msg: rq=ff rqtype=21 value= index=00 len=0 Mar 30 10:47:17 OF455 kern.debug kernel: usb-storage: Soft reset failed: -32 Mar 30 10:47:17 OF455 kern.debug kernel: usb-storage: scsi cmd done, result=0x7 Mar 30 10:47:17 OF455 kern.debug kernel: usb-storage: queuecommand called Mar 30 10:47:17 OF455 kern.debug kernel: usb-storage: *** thread sleeping. ... this error is repeated several times before being reported to the SCSI layer... Mar 30 10:47:17 OF455 kern.info kernel: SCSI error : <0 0 0 0> return code = 0x7 Mar 30 10:47:17 OF455 kern.warn kernel: end_request: I/O error, dev sda, sector 114112 Mar 30 10:47:17 OF455 kern.err kernel: Buffer I/O error on device sda, logical block 57056 ... the same for block 57057, 57058, 57059 Mar 30 10:47:18 OF455 kern.info kernel: SCSI error : <0 0 0 0> return code = 0x7 Mar 30 10:47:18 OF455 kern.warn kernel: end_request: I/O error, dev sda, sector 114118 Mar 30 10:47:18 OF455 kern.err kernel: Buffer I/O error on device sda, logical block 57059 ... now the device goes offline: Mar 30 10:47:18 OF455 kern.debug kernel: usb-storage: *** thread awakened. Mar 30 10:47:18 OF455